CatalogueOfLife / data

Repository for COL content

ZOBODAT Vespoidea - genera possibly in wrong family #421

Open ManonGros opened 2 years ago

ManonGros commented 2 years ago

Describe the problem: Issue originally reported on the GBIF feedback portal: https://github.com/gbif/backbone-feedback/issues/263 by @PatrickKennedy2022

Patrick noticed that the following genera are placed in the solitary wasp family Eumenidae when they should be in the Vespidae family:

Link to affected CoL webpages:

Literature references: No reference provided

ManonGros commented 2 years ago

Keep in mind that there is another issue currently open concerning the Polistes genus: https://github.com/CatalogueOfLife/data-vespoidea/issues/1

yroskov commented 2 years ago

As I mentioned before, ZOBODAT Vespoidea is a frozen dataset and, unfortunately, cannot be updated in CoL.

However, if @PatrickKennedy2022 can kindly offer the global checklist for Vespoidea as an alternative to ZOBODAT, we'll be happy to include it in the CoL.

mdoering commented 2 years ago

@dhobern @yroskov how do we plan to work with this group in the near future then? We need a way to still improve the data

bdagley commented 1 year ago

I'm reposting the same comment I added on a related open issue:

The COL page taxonomy is incorrect, since Eumeninae has been treated as a subfamily for years, and also because COL lists at least three genera that never were in Eumeninae/dae: Belonogaster, Polybia, and Polistes (family Vespidae: subfamily Polistinae). The prevailing taxonomic view currently considers Vespidae a family (within superfamily Vespoidea), and all Vespidae or 'Eumeninae' taxa falling into the following subfamilies: Vespinae, Eumeninae, Polistinae, Stenogastrinae, and Masarinae. Carpenter (1986) listed all the genera in Eumeninae, and Vecht and Carpenter (1990) listed all Vespidae genera. The iNaturalist Vespidae taxonomy should also be correct, although doesn't yet list all genera because it focuses on observed photos.

Although @yroskov you asked for a global checklist alternative, so do you mean you need a website? The following are notes on websites. EOL probably won't work because it merely lists all Vespidae genera under Vespidae with no subfamilies (https://eol.org/pages/5242/names), and also seems to have multiple separate "Vespidae" pages. Discover Life only lists certain Vespidae genera, and uses the outdated family Eumenidae for subfamily Eumeninae, although it doesn't miscategorize Polistes (it correctly places it in Polistinae). ITIS probably also has similar limitations. The iNaturalist taxonomy linked above could work, if it is possible for COL to use iNaturalist.

One consideration about Vespidae is that it includes numerous species, making it unlikely that any one website source includes the complete taxonomy. Due to this, I wonder if the above checklist publications may need to be used to manually assign some genera to subfamilies. I could also volunteer to indicate which subfamily a list of unassigned genera belong to, if helpful.

yroskov commented 1 year ago

Although @yroskov you asked for a global checklist alternative...

CoL can replace one global checklist with another. We cannot make alterations inside a chosen checklist.

bdagley commented 1 year ago

Okay, so the only options I can think of are either to use iNaturalist's taxonomy (at least we can correct and expand that one; I'm a curator), or to use EOL if someone were to add the subfamilies etc. there, if possible. I just checked ITIS and they're also outdated. Now, in the event iNaturalist taxonomy can be used, I would first work on refining and completing that taxonomy within iNaturalist before suggesting it be used here.

mdoering commented 1 year ago

@yroskov @dhobern if Vespidae is a frozen dataset we could create a new GSD within ChecklistBank and merge information from other lists into a new project which could act as a new global resource for COL. It would just need at least one person who drives and maintains this in the future...

mdoering commented 1 year ago

We do have iNat's taxonomy in CLB btw, it's being updated regularly. Here are the wasps: https://www.checklistbank.org/dataset/139831/taxon/https%3A%2F%2Fwww.inaturalist.org%2Ftaxa%2F52747

mdoering commented 1 year ago

If I read iNat correctly it is sourced from GBIF and NatureWatch NZ? https://www.inaturalist.org/taxa/52756/schemes

dhobern commented 1 year ago

In the absence of an actual dataset we can readily use and reference as authoritative, would it be simplest to graft the Zobodat Eumenidae into COL as Vespidae: Eumeninae? We need to be moving towards subfamily and tribal classification in the hyperdiverse insect orders. Clearly, it's not satisfactory that none of the other subfamilies is yet included (and I expect that some genera will still be wrongly placed between Eumeninae and other Vespidae) but it would be more correct than we have today.

bdagley commented 1 year ago

If I read iNat correctly it is sourced from GBIF and NatureWatch NZ? https://www.inaturalist.org/taxa/52756/schemes

That page is only for one genus, but seems to be as you said. Also for iNaturalist wasp taxonomy as a whole we sometimes also later make manual taxon changes called deviations. So I would assume each of the Vespidae subfamilies on iNat. have also been affected by deviations.

bdagley commented 1 year ago

Also just wondering why the Zobodat database is frozen on COL.

mdoering commented 1 year ago

iNaturalist does not use authorships for their names which is a blocker for using it in COL:

https://www.checklistbank.org/dataset/139831/names?TAXON_ID=https%3A%2F%2Fwww.inaturalist.org%2Ftaxa%2F52747&facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&offset=0&rank=species&status=accepted&status=provisionally%20accepted

Otherwise here is a diff of the names under Vespoidea from Zobodat and iNat. Quite some difference:

[Two images: diff of the names under Vespoidea, Zobodat vs. iNat]
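A name diff like the one described above can be approximated with plain set operations. The following is only a minimal sketch, not how ChecklistBank computes its diffs: the `canonical` helper is a naive, hypothetical way of dropping authorship so that names with authorship can be compared against iNat names without it, and the membership of the two sample lists is made up for illustration.

```python
def canonical(name: str) -> str:
    """Keep the leading Latin name parts and drop any trailing authorship.

    Naive assumption: authorship starts at the first capitalised token
    after the genus, or at an opening parenthesis. Names with a subgenus
    in parentheses or lowercase author particles are not handled.
    """
    parts = []
    for token in name.split():
        if parts and (token[0].isupper() or token[0] == "("):
            break
        parts.append(token)
    return " ".join(parts)


def diff_names(list_a, list_b):
    """Compare two name lists on their canonical (authorship-free) form."""
    set_a = {canonical(n) for n in list_a}
    set_b = {canonical(n) for n in list_b}
    return {
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        "shared": sorted(set_a & set_b),
    }


# Illustrative samples only; not the real content of either dataset.
zobodat_sample = ["Polistes Latreille, 1802", "Eumenes Latreille, 1802", "Vespa Linnaeus, 1758"]
inat_sample = ["Polistes", "Eumenes", "Polybia"]

result = diff_names(zobodat_sample, inat_sample)
print(result)
```

Comparing on the canonical form is what makes the two sources comparable at all here, but it also hides homonyms, which is one reason missing authorship remains a blocker for direct use.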
mdoering commented 1 year ago

Looking just at Vespidae, this would be in iNat and Zobodat:

[Two images: Vespidae names in iNat and in Zobodat]
mdoering commented 1 year ago

How about the following options for a quick solution:

yroskov commented 1 year ago

Also just wondering why the Zobodat database is frozen on COL.

ZOBODAT at the Biologiezentrum, Landesmuseen, Linz, Austria contains a digital version of the Vespoidea monographic treatment by Dr Josef Gusenleitner. The Vespoidea checklist joined the Catalogue of Life via ZOBODAT in 2006, after Gusenleitner's retirement. As I understand it, the data have not been changed since that time due to the absence of a taxonomic expert in the Landesmuseum.

bdagley commented 1 year ago

I understand some of this although am not yet as familiar with curating GBIF, so feel free to further clarify.

iNaturalist does not use authorships for their names which is a blocker for using it in COL

If this is replying to my question about using iNaturalist for GBIF taxonomy, I'd wondered if GBIF could use iN taxonomy directly rather than use iN taxonomy via COL using iN. For general taxonomy outside of Hymenoptera, I think iN most often receives taxonomy from EOL, or COL if not. Also re: iN not using names, that is true although they import taxa from sources that do use names.

There are fewer authoritative online databases in general for Hymenoptera (wasps and bees) compared to other wildlife groups. iN says it uses Discover Life and BugGuide secondarily as bee/wasp authoritative sources, although I'm unsure where iN originally imported most of its Hymenoptera taxonomy from. Vespidae on iN cites NatureWatch NZ and "GBIF (4990)" as taxon schemes (https://www.inaturalist.org/taxa/52747/schemes), and the taxon pages for each subfamily cite GBIF (3990) as the only scheme (e.g. https://www.inaturalist.org/taxa/119344/schemes). I also know that since that time, iN curators have made a few taxon changes that "deviate" from schemes, such as adding genera that weren't included as well as synonymizing/merging genera. Another confusing matter is that we're discussing errors in GBIF taxonomy that aren't included on iN and yet iN uses GBIF as the scheme.

Also to clarify, when I asked about using iN taxonomy for GBIF, I assumed iN taxonomy might not be ready to use yet due to being incomplete (compared to the full classification in taxonomic literature) so I'd need to update/complete iN taxonomy first if it were to be used. Unless for some reason that taxonomy wouldn't need to be complete in listing all taxa, or if GBIF could combine multiple sources including iN to form a taxonomy. Although now it sounds like iN might not be possible to use at all.

In any event, I wonder how ZOBODAT could be fixed, and would prefer that it be fixed, since it seems to be affecting multiple databases. I don't remember what year Eumenidae became Eumeninae in the literature, but am fairly sure that the polistine genera weren't in Eumeninae/dae in 2009, so there seem to be some general errors in ZOBODAT. It would be best if someone could help fix that in the Landesmuseum, even including a volunteer whom we or I could advise via email if necessary. Regardless of what steps are taken, if helpful I can also verify or find complete genus checklists for all subfamilies of Vespidae.

dhobern commented 1 year ago

Thanks @bdagley - for anyone to maintain a high-quality taxonomic checklist is a massive amount of work. There seems to be no way that the Vespoidea will once again be maintained by Zobodat.

Luo et al. 2022 (https://doi.org/10.3390/insects13060529) and Piekarski et al. 2018 (https://doi.org/10.1093/molbev/msy124) give support for some of the changes that are required, but will not get us close to positioning all genera correctly, let alone dealing with synonymy, etc.

I'll raise the superfamily for discussion at the Taxonomy Group meeting, but I doubt we'll have a quick fix. What would be most useful would be if any hymenopterist has a current near-complete classification for the genera.

bdagley commented 1 year ago

Thanks. In the meantime, I'll try to create such a checklist, and potentially ask questions or provide my checklist for review to taxonomists such as James Carpenter, who has written checklists, and others. This family is actually one of my primary focuses for identification on iNaturalist and Bug Guide, along with Crabronidae, Sphecidae, and Anthophila/bees. So, I'd also like to create a full database of genera for iNaturalist alone, whether or not it's used on GBIF. I will provide an update here in later weeks on this.

mdoering commented 1 year ago

Thanks @bdagley, if you would produce a full global list of the superfamily or some of its families, we would be glad to include it in checklistbank.org and use it for the COL checklist if it appears to be of better quality than the current ZOBODAT one.

As a simple start in fixing small errors I have placed the current COL content of Vespoidea in github files for editing if we think that is a solution for now at least: https://github.com/CatalogueOfLife/data-vespoidea

bdagley commented 1 year ago

Thanks @mdoering. I can help with this, although I am just learning how to work with file types including ColDP, so I may need to view some examples of how to make the changes. Because the superfamily has numerous species, it may be best to start with the current COL content of Vespoidea and make small changes to it, rather than rebuilding the entire taxonomy. Unless there are easier ways to do that, in which case I could consider rebuilding it. I'm not yet familiar with editing these types of files, although I could try it.

One major thing for family Vespidae on GBIF or COL, is that many genera are currently under family Vespidae but the rest are under family "Eumenidae," which is supposed to be a subfamily of Vespidae, Eumeninae. There are a few more than 3 subfamilies that are supposed to be in Vespidae, although the main ones are Vespinae (not added), Polistinae (not added), and Eumeninae (added, as a family).

So, it would be good to estimate or plan if GBIF/COL will ever be able to add ranks for Vespidae subfamilies. For families such as Vespidae, and also for Pompilidae (spider wasps), bees, etc., subfamilies are very useful because their diversity makes for a very long list of genera otherwise. Alternatively, if subfamilies won't be used, it would be better to move the genera under family Eumenidae to be under family Vespidae.

mdoering commented 1 year ago

If we can get the subfamilies sorted COL will definitely adopt these, yes. I could help you to get started on this. It would basically just mean changing the parentID column of all genera in the NameUsage.tsv file to point to the correct (newly added) subfamily ID. That seems doable to me as a first step and I am happy to add the required subfamilies so you would just have to change the parentID values. If we wanna give this a try I would suggest to keep all future discussion and issues in the data-vespoidea repository.
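As a rough illustration of the parentID change described above, here is a minimal sketch in Python. It assumes the NameUsage.tsv file is tab-separated with columns named `ID`, `parentID`, `rank` and `scientificName`; the row IDs (`fam1`, `sf1`, `g1`, ...) and the helper name `reassign_parents` are hypothetical, and the real file in data-vespoidea may be laid out differently.

```python
import csv
import io


def reassign_parents(tsv_text, genus_to_parent):
    """Return (updated TSV text, number of genus rows whose parentID changed)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    changed = 0
    for row in rows:
        new_parent = genus_to_parent.get(row["scientificName"])
        # Only touch genus rows that are listed in the mapping.
        if row["rank"] == "genus" and new_parent and row["parentID"] != new_parent:
            row["parentID"] = new_parent
            changed += 1
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), changed


# Hypothetical sample: two genera sitting under "Eumenidae" are re-pointed
# to newly added subfamily rows under Vespidae.
sample_rows = [
    ["ID", "parentID", "rank", "scientificName"],
    ["fam1", "", "family", "Vespidae"],
    ["sf1", "fam1", "subfamily", "Polistinae"],
    ["sf2", "fam1", "subfamily", "Eumeninae"],
    ["fam2", "", "family", "Eumenidae"],
    ["g1", "fam2", "genus", "Polistes"],
    ["g2", "fam2", "genus", "Eumenes"],
]
sample_tsv = "\n".join("\t".join(r) for r in sample_rows) + "\n"

updated_tsv, changed = reassign_parents(sample_tsv, {"Polistes": "sf1", "Eumenes": "sf2"})
print(updated_tsv)
```

Species rows are left alone because they hang off their genus, so moving the genus moves its children with it.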

mdoering commented 1 year ago

GBIF currently still cannot handle any ranks other than the major Linnean ones, so subfamilies won't show in the backbone for some time at least. But COL has been embarking on more ranks and richer classifications, as needed in the different groups, for over a year.

bdagley commented 1 year ago

This sounds like a good plan, although if we assign all the genera (and child taxa species) to subfamilies for COL, will anything change on GBIF yet? If not, I'd still like to fix COL, although I'd hope the GBIF classification wouldn't become even more imperfect than it is now. Currently GBIF has some species in Vespidae, some in family "Eumenidae." There are also additional errors to fix from ZOBODAT, including duplicates of some genera, where many of the duplicate genus taxa are misspelled, or have no child taxa, or have no occurrence records for the genus or its species. So, I wonder if we could simply delete any of those, if we note that it's a deviation from ZOBODAT.
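One way to shortlist candidate misspelled duplicates for manual review is a simple string-similarity pass over the genus names. This is only a sketch: the 0.9 threshold is an arbitrary assumption, "Ancistrocerrus" is an invented misspelling used for illustration, and any flagged pair would still need checking against the literature, since some valid genera have very similar names.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.9):
    """Return (name_a, name_b, similarity) for pairs at or above the threshold."""
    unique = sorted(set(names))
    pairs = []
    for a, b in combinations(unique, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


# "Ancistrocerrus" is a made-up misspelling, not a name taken from ZOBODAT.
genera = ["Ancistrocerus", "Ancistrocerrus", "Polistes", "Polybia", "Delta"]
flagged = near_duplicates(genera)
for a, b, score in flagged:
    print(f"{a} ~ {b} (similarity {score})")
```

The pairwise comparison is quadratic in the number of names, which should be fine for a few hundred genera but would need a smarter approach for species-level lists.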

I agree we should keep the discussion and issues in the data-vespoidea repository. If you view issues I created, I created over 10 yesterday regarding different vespid genera, and there are some from before that. I could move those to the data-vespoidea repository if I'm able to with my account. If I'm not able to, I wonder if my account could be updated to allow that.

These are the correct Vespidae subfamilies (not from any particular website): Vespinae, Polistinae, Eumeninae, Euparagiinae, Masarinae, and Stenogastrinae. Some sources also list two extinct subfamilies, Priorvespinae and Protovespinae, although I'm unfamiliar with those.

mdoering commented 1 year ago

Sounds like a plan. Let's move issues and discussion over and I'll start simplifying the raw files a little so they are better suited for editing.

bdagley commented 9 months ago

I have a few further thoughts on this, going back to the beginning of this discussion. This is a long message, but it relates to numerous existing taxonomic errors and a complicated issue that I can't explain in fewer words. It remains confusing how there are so many errors in the Zobodat database. For example, there are what seem to be multiple misspelled versions (not once-accepted past synonyms) of genera such as Ancistrocerus. This is true of essentially every one of the many vespid genera in each subfamily, and many of the genera are in the wrong subfamilies. Their placement in the wrong subfamilies also doesn't seem to correspond to a past once-accepted classification, but to be due to mere errors.

Commenters here have mentioned that it would be ideal in the future to include tribal vespid classification, and to correct that Eumenidae (a once-accepted family) has for a long time now become considered a subfamily, Eumeninae, and to make the other noted corrections. One side note is that the tribal classification of Eumeninae isn't yet fully determined even in the taxonomic literature, so no tribes should be added yet. Similarly, it's probably unnecessary to add vespid subgenera at this time, which often aren't widely used anyway. In order of priorities, the main errors that seem ideal to fix first are moving the genera to the correct subfamilies and correcting all the genus-specific and species-specific errors.

I volunteered to help with this earlier, and have helped to at least document and discuss some of the issues. However, as many are probably aware, Vespidae (and Vespoidea) includes a very large number of genera and species. And we can now multiply that number of taxa, since many of the genera and species on GBIF are duplicated multiple times, or are misspelled or no-longer-accepted synonyms (including being members of different genera than they currently are in the literature). For these reasons, fixing all of these errors manually no longer seems feasible to me, whether for me and/or for GBIF developer(s) to complete.

This brings the issue back to the Zobodat database. It's been said that the database is frozen and that no one there can fix it. In the event that the people who created the Zobodat database are still living and working, it would seem ideal to at least explain the issue to them, also even for the sake of them noting that their own website database (in addition to the database shown on GBIF) should ideally be fixed if/when possible, or annotated/flagged to notify viewers that it contains errors. Technically, the cause of the problem isn't due to GBIF/COL employees or volunteers. I also don't understand if the Zobodat database is permanently frozen or why no one associated with Zobodat could potentially help to fix it. Are there any current Zobodat members/employees or volunteers, and if not will there be ones in the future? This really seems like something they, if anyone, should fix.

The other possibility commenters here mentioned would be for GBIF/COL to use a different Vespoidea database source. We established that iNaturalist's can't be used because they don't use authority names and dates, and because while they currently include many of the vespoid taxa in their taxonomy they don't include all of the taxa currently shown on GBIF from Zobodat. I'd be open to using a different, third, taxonomic source instead, if it would work to do so.

However, in the event that the long term situation or plan will end up being to continue to use the frozen Zobodat database, for one reason or another, it seems that the conclusion should be that someone associated with Zobodat or who becomes associated with it in the future should ideally fix it, and fix their own database. If GBIF developer(s) or I were instead to manually work on fixing it here, it simply would take far too long to be feasible, and would require extensive additional researching of the literature to understand what additional not yet documented changes need to be made. If anyone, such as volunteer(s), were to work on it, it would either need to be a full time project (which would probably need to be funded) or else would take years or longer to complete if it were only worked on part time or sporadically.

In the event that nothing will change, or that nothing will change for a very long time, I also think it would be ideal if possible to add some kind of disclaimer or flag to the Vespidae/Vespoidea taxonomy shown on GBIF to indicate that it contains errors. Otherwise, authors of publications may publish errors based on it. For example, a "!" symbol, or red text, could be displayed on or next to the names of duplicated or misspelled genera, especially ones that contain no occurrence records.

Feel free to discuss or agree or disagree. I have also heard other external vespid taxonomists be critical of the way the GBIF taxonomy (which is actually due to Zobodat) is currently displayed with many errors. Also, Vespidae is one of the most well-known groups of wasps, what some consider "typical wasps," somewhat comparable to how bees are also so well-known among other insects, so I'm not merely raising these issues out of a personal interest in an obscure taxonomic group. Lastly, as I hope is clear, I'm not blaming anyone for the problems or saying that anyone must fix them, only documenting the errors and discussing what could potentially be done. I actually suspect that it may be impossible to fix the issue, unless someone associated with Zobodat, or who becomes associated with Zobodat in the future, fixes it, or unless funded individual(s) (and it probably would require multiple people) somehow were to work on it here full time.

yroskov commented 9 months ago

...other vespid taxonomists be critical of the way the GBIF (actually due to Zobodat) taxonomy is currently displayed with many errors. Also, Vespidae is one of the most well-known group of wasps...

Thank you @bdagley for your assessment of the problem with available data. Indeed, it would be great if Vespid taxonomists would come together for a project, find funds, and create an authoritative global Vespid checklist that could replace ZOBODAT in CoL. Perhaps you could talk to your colleagues about this. I am sure both GBIF and CoL will endorse such valuable efforts.

DaveNicolson commented 9 months ago

An entomologist with ITIS (Daniel Perez) has been chipping away at an attempt to complete a global update for Vespidae, although there is a lot still to do before we can load it into ITIS (whereupon it can be easily compared against expectations & other lists). As has been noted, it is very difficult to "complete" such work in the absence of comprehensive & recent published catalogs/checklists. Nevertheless, at some point we will need to release it (once it goes through our internal QA/QC processes).

As of right now, the file contains 7346 new (to ITIS) scientific names (all with authorship and year), including 5215 valid species (and 1403 subspecies, including nominotypical) in 268 valid genera. There are no tribes in it, just subfamilies (including Eumeninae), although Daniel notes that where the tribal allocations of genera are known and not too unstable, it would not be hard to add them.

This is still very much in draft form, as Daniel notes that there are still sure to be some names found in 2 genera (to be detected and reconciled), and some still need to have a supporting reference added, etc. (not to mention that there are surely some names that were missed, which is par for the course when compiling without the benefit of recent catalogs).

This doesn't help for the Vespoidea outside of Vespidae, but that's what we have (in the pipeline) for now. If there is anyone interested, I'm sure we could share lists generated from the working file, for simple assessments.

bdagley commented 9 months ago

Thanks for your comments, I may agree as far as I understand the proposals, although am less familiar with some of the other databases. As for me, my main focus is Vespidae specifically re: Vespoidea, and also reminding that we're referring to a global checklist of taxa, although I assume everyone already knows and implied that. Re: tribal classification, if it ever were to be used I'd agree with a plan to categorize the genera that are known to be members of specific tribes as members of those tribes in the website taxonomy, such as Eumenes belonging to Eumenini. The use of these tribes will help organize vespid taxonomy because otherwise it's an overly long list of genera. As a side note, the current tribe Zethini is suspected to become revised to become a new subfamily in the future, Zethinae, although that requires further phylogenetic studies that could be many years away.

bdagley commented 7 months ago

I've gone back and forth on my thinking about what to do about this problem.

To clarify, is or was anyone able to confirm if any future employee or volunteer associated with zobodat can un-freeze the zobodat taxonomy and (presumably) do most of the work on their end to fix it? Because in some ways, that might be the most efficient and ideal way to fix Vespidae and Vespoidea, since most of the individual genera and species are already added (it's just that some are misspelled, duplicated, made members of the wrong subfamily, etc.).

Re: the ITIS work @DaveNicolson said Daniel Perez is chipping away at, does anyone know what percent of the true total number of Vespidae species the current number added/being added to ITIS represents? Or, at least, how the percent compares to the number of Vespidae species currently on GBIF. Yet, the total on GBIF is somewhat overestimated, due to duplicates, misspellings, "spec." being listed as if they're species names, etc. My potential concern would be if the ITIS number of species, or the number of species from other sources commenters suggested, remains far from the true total number of vespid species. In which case, it could potentially be unfeasible to complete those projects, at least to use them for this GBIF function.

I also wonder if I'm correct to assume that any major change (to Zobodat, or involving other databases potentially replacing it) would only be expected to be implemented, if ever, years, possibly many years from now? With regard to how "feasible" it is to fix and/or correct a given Vespidae taxonomy (e.g., like the Zobodat one and ITIS one), I also assume that may differ, depending on if different platforms are faster or easier to revise and/or have more people working on them.

Regardless, I'm somewhat assuming that the zobodat database will remain at least for a while, probably years. I haven't yet actually used the tools to edit the taxonomy that were mentioned to me (e.g. using Git), but I could possibly try to correct merely one genus to start with (Delta), to see how long that takes me, or if anyone would be able to help with part of it. I'm currently writing a co-authored manuscript about global Delta species distributions, hence my primary interest in that genus. In similar fashion, it could possibly be helpful for vespid authors, when publishing new articles or catalogues about particular genera, to attempt to fix the current taxonomy for that given genus. In that way, some or many of the "main" (most well known and abundant) genera and species could become corrected. That could be useful during the years in which this taxonomy seems unlikely to change (until another database eventually replaces it, or a future zobodat-associated person begins to fix their database on their own), and would be generally useful if the current zobodat database is never altered or replaced (aside from the corrections I just suggested).

Currently, I'm also busy, so I'm unsure how long it would take me to learn how to fix the Delta genus and then fix it. In the meantime, I'd also mentioned that it may seem unnecessary or too time-consuming in some ways to keep adding individual Vespidae genus or species Issues. But on the other hand, those currently function as notices (at least to people who can and do view them) that there are problems with the taxa those Issues relate to. Personally, it's unlikely that I'll add many more Issues on a regular basis like I did before, but I may occasionally add any new ones I find, if that's considered acceptable.

Finally, if I were to reference GBIF's taxonomy in a Vespidae publication or to a Vespidae museum taxonomist/curator, I'd suggest that they mostly search by individual genera and species, since most of the species themselves do have "correct pages" (despite that some also have duplicated pages). In other words, the best current way for a user to view the taxonomy (if they don't want to encounter any errors) is to search directly by genus or species. Or, they can click on a genus to see the checklist of species, but again, that will include some inaccurate, misspelled, or duplicate taxa.

Feel free to share thoughts.

DaveNicolson commented 7 months ago

I've asked Daniel Perez to summarize the status of his work re Vespidae (numbers of valid genera and species, a sense of where he is in process and when he might expect to have it ready to load into ITIS etc.), and will report back.

DaveNicolson commented 7 months ago

Daniel summarized the work as follows:

"Valid Genera – 270 - 31 invalid
Valid Subgenera – 28 - 1 invalid
Valid Species – 5,328 - 297 invalid
Valid Subspecies – 1,404 - 134 invalid
69 taxa without references, plus a few missing authors or missing dates of description. Some reconciling also needed. Only 31 species/subspecies are missing distributions. 511 pdfs have been used."

He indicates "The file could be advanced a little more by starting proofing now [that is, passing it for cross-checking to another ITIS staffer, we do this twice for each project] but probably could not be taken to completion without some additional information."

bdagley commented 7 months ago

The ITIS numbers of vespid taxa above seem correct or nearly so (and it seems they're still being added to) for valid genera and species (I don't know for subspecies). Although it was from 2013, Aguiar et al. (2013) wrote: Family Vespidae Latreille, 1802 (268 (and †3) genera, 4,932 (and †11) species), where the (†) symbol means extinct taxa. Various other sources for Vespidae like Bug Guide and Wikipedia say "nearly" or "approximately" 5,000 species, but I believe the true current number is indeed above 5,000, as you suggest, since species have been added in revisions subsequent to 2013, which is over a decade ago now.
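For a quick sense of scale, the draft ITIS counts reported earlier in the thread can be set against the 2013 counts quoted above. This is just arithmetic on the numbers already given here (extant taxa only), not a statement about the true totals.

```python
# Counts taken from this thread: the ITIS draft summary and the 2013 figures.
itis_draft = {"genera": 270, "species": 5328}
counts_2013 = {"genera": 268, "species": 4932}  # extant taxa only


def compare(current, baseline):
    """Return {rank: (absolute difference, percent change vs. baseline)}."""
    summary = {}
    for rank, base in baseline.items():
        diff = current[rank] - base
        summary[rank] = (diff, round(100 * diff / base, 1))
    return summary


summary = compare(itis_draft, counts_2013)
for rank, (diff, pct) in summary.items():
    print(f"{rank}: {diff:+d} ({pct:+.1f}% vs. 2013)")
```

On these figures the draft has 396 more valid species than the 2013 count, roughly 8% more, which is at least consistent with the total now being above 5,000.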

I wonder if any recent publication (or any other reference source) such as a species catalogue has exact stats on at least the current number of genera and species, and if possible subspecies (and we already know the number of subfamilies and tribes). Carpenter & Brown (2021) cite Eumeninae alone as having nearly 4,000 species, but then write "see Pickett & Carpenter, 2010," possibly suggesting that none of the literature published since Aguiar et al. (2013) includes the full number of species, etc.

I tried to manually count the vespid genera on GBIF, which are members of the parent taxa Vespidae, "Eumenidae" (Eumeninae), and "Masaridae" (Masarinae), for a total of approx. 161 (which includes at least a few duplicates). GBIF does not have parent taxa corresponding to the other valid subfamilies, even when written as if they were families. I didn't count the iNaturalist genera, species, or subspecies, but we already know those would be undercounts because iNaturalist mostly only adds taxa once they've been photographed, and many haven't.

These would be my questions for the possibility of using the ITIS system (once it's finished):

Where is the complete taxa/subtaxa list coming from for each Vespid rank (the "511 pdfs"?), and is there such a complete listing in any print or online source?

Are there known or expected to be issues similar to GBIF like no-longer-accepted synonyms, duplicate taxa, non-actual genera or species titled things like "[Genus] spec." (unidentified species of [Genus]), missing taxa, misspelled taxa, etc.? In the event yes, what is the estimated extent of such problems compared to the (large) current extent of such problems on GBIF?

Is there any rough estimation of when the entire ITIS system would be completed? If additional people helped check it or work on it, would that change the estimation by much?

And these are questions for GBIF and ITIS: would any additional GBIF vespid taxon page features, information, functionality, etc. be lost or made to become incorrect if the new ITIS system were to be used? Most importantly, would any of the existing records and their photo galleries become lost or re-assigned to incorrect parent taxa?
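The "[Genus] spec." placeholder entries raised in the questions above are the easiest of these problems to screen for mechanically. The sketch below uses a regular expression that is only an assumption about how such placeholders are written (`sp.`, `spec.`, `spp.`, optionally followed by a number); real data may use other conventions.

```python
import re

# Assumed placeholder forms: "Genus sp.", "Genus spec.", "Genus spp.", "Genus sp. 2"
PLACEHOLDER = re.compile(r"^[A-Z][a-z]+ (sp|spec|spp)\.?( \d+)?$")


def is_placeholder(name: str) -> bool:
    """True if the name looks like an unidentified-species placeholder."""
    return bool(PLACEHOLDER.match(name.strip()))


names = ["Delta spec.", "Delta unguiculatum", "Polistes sp.", "Vespa crabro"]
placeholders = [n for n in names if is_placeholder(n)]
print(placeholders)
```

Filtering these out before counting species would remove one source of the overestimate discussed earlier, though duplicates and misspellings need separate handling.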

DaveNicolson commented 7 months ago

In the absence of a recent & comprehensive catalog to follow, as in this case, Daniel has to dig through the literature to search for national and regional lists for the group or any part of it, building a list of additional papers to find as he goes. Obviously this does not qualify as a taxonomic revision, but he is trying to catch and confront any clear issues along the way (or in our cross-checking).

Daniel is a taxonomist who initially focused on grasshoppers, but he has been working on ITIS data at the Smithsonian for about 20 years, and continues his own publications of course (including Arthropods of Hispaniola from 2008 and a 2020 update). He obviously can't create bulletproof lists for everything, but he is good at building these lists out and our team is good at catching issues for further research or resolution.

Based on his comments, I expect we could get this finalized and loaded into ITIS later this year (no firm timeframe, but if it is a priority we can push it). It will likely contain some errors (so it goes!), which we can address (not immediately) as time goes on.

I don't know about how changing sources would affect GBIF functionality, but I imagine that ZOBODAT may have rather more synonymy than Daniel's will (and that could hurt functionality?). For ITIS, we are often unable to provide complete synonymy, particularly when there is no overall source to follow in that. But we have been trying to shift towards more synonymy, sometimes after Daniel's initial work (so in the initial update), sometimes in subsequent updates. And once ITIS shifts to our new online taxonomic workbench, it will be much easier to bring in outside collaborators to make additional updates (or to fill in synonymy, etc.); we hope to make that shift soon.

mdoering commented 7 months ago

As we progress well with the extended COL checklists, at least homotypic synonymy isn't much of a problem. We will simply merge other sources, probably including ZOBODAT, so that these other combinations will make it into the extended list which will be used by GBIF. The biggest priority is an up-to-date taxonomy with most of the currently accepted species in it.

bdagley commented 7 months ago

I agree with this plan. At some point, it will be helpful to have Marco Selis, and potentially also James Carpenter (whom I could contact), review the final or near-final lists. For example, as I mentioned, it was relayed to iNaturalist that Carpenter didn't (yet) want to make Zethini become Zethinae, even though phylogenetic studies so far suggest it will eventually become Zethinae, and more recently Selis gave reasons for not (yet, fully) implementing a recent revision by Dai et al. The latter discussion is here: https://www.inaturalist.org/flags/649108. However, it is also possible that in some cases there will be certain agreed-upon differences between the Vespidae taxonomy chosen for GBIF, COL, and ITIS and the one used on iNaturalist.

I am familiar with how GBIF currently deals with synonyms, for example referring to current taxa as "accepted taxa," although I remain not entirely sure about matters like the following, quoted from DaveNicolson's comment above:

"I don't know about how changing sources would affect GBIF functionality, but I imagine that ZOBODAT may have rather more synonymy than Daniel's will (and that could hurt functionality?). For ITIS, we are often unable to provide complete synonymy, particularly when there is no overall source to follow in that. But we have been trying to shift towards more synonymy, sometimes after Daniel's initial work (so in the initial update), sometimes in subsequent updates. And once ITIS shifts to our new online taxonomic workbench, it will be much easier to bring in outside collaborators to make additional updates (or to fill in synonymy, etc.); we hope to make that shift soon."

One other matter that would probably require further discussion is whether records identified only to genus, e.g. "Eumenes spec." (meaning "Eumenes species"), or tentatively assigned to a particular species, e.g. "Delta cf. esuriens", should be included at all in the future taxonomy shown on GBIF. Such records often originate from genetic barcoding websites like BOLDSystems, where I have already found that mistakes made by some individual submitting authors in their barcoding process led to entirely misidentified records in some cases. I at least think the "Eumenes spec."-like records would be best either consolidated in some way or separated from the rest of the genus (Eumenes) checklist as it's currently displayed on GBIF, if not removed entirely. Currently, multiple such "spec." records are listed at the end of the species checklists of many genera.
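To make the consolidation idea concrete, here is a minimal Python sketch of how such entries could be separated from the regular species names in a checklist. It assumes the checklist is available as plain name strings; the regular expressions, the function names `classify_name` and `split_checklist`, and the sample list are hypothetical illustrations, not part of any GBIF, COL, or ITIS tooling, and real data would likely need more markers than the few handled here.

```python
import re
from collections import defaultdict

# Hypothetical patterns for open-nomenclature entries (assumptions, not a standard):
# "Genus spec." / "Genus sp." style placeholders, optionally followed by a label.
PLACEHOLDER = re.compile(r"^[A-Z][a-z]+\s+(?:spec|sp)\.?(?:\s|$)")
# "Genus cf. epithet" / "Genus aff. epithet" style tentative identifications.
TENTATIVE = re.compile(r"^[A-Z][a-z]+\s+(?:cf|aff)\.\s+\S+")


def classify_name(name: str) -> str:
    """Return 'placeholder', 'tentative', or 'named' for one checklist entry."""
    name = name.strip()
    if PLACEHOLDER.match(name):
        return "placeholder"
    if TENTATIVE.match(name):
        return "tentative"
    return "named"


def split_checklist(names):
    """Group checklist entries by category, preserving their input order."""
    groups = defaultdict(list)
    for name in names:
        groups[classify_name(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative sample only; not taken from an actual GBIF checklist export.
    sample = [
        "Eumenes coarctatus",
        "Eumenes spec.",
        "Eumenes sp. 1",
        "Delta cf. esuriens",
    ]
    for category, entries in split_checklist(sample).items():
        print(category, entries)
```

With the entries grouped this way, the "placeholder" and "tentative" groups could be displayed separately from the named species, collapsed into one entry per genus, or dropped, depending on what is decided.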

I can mention these recent updates in decisions to Selis and a few other taxonomists like R. B. Lopes, and they may potentially have comments I can relay here or want to become directly involved, if possible, which I recommend. I'm not contacting Carpenter at this time but it will be ideal for us to also eventually contact him about this.

bdagley commented 7 months ago

One update. I referred Marco Selis to this page and he read the full discussion. He didn't exactly give a quote for me to relay here, but he reminded me that many taxonomic revisions in the Vespidae (especially Eumeninae) literature are still known to be needed, and may occur relatively rapidly. He has published several himself and is often in the process of working on additional ones. Now speaking again in my own opinion, the inference I would draw from that is that even if we were to finalize Vespidae taxonomy now on ITIS, or use it on GBIF and COL, we would still have to make relatively many later changes due to revisions. So, to me, this potentially indicates that it may be best to wait on editing the GBIF taxonomy for the time being, unless people are very keen to update it as soon as possible while also understanding that many additional changes will be needed over the coming years because Eumeninae taxonomy is in such a state of flux. At this moment, I no longer have an opinion, although if we decide to keep the GBIF/COL taxonomy as it is for the time being, I would still like to try editing at least one genus (using instructions that were given to me), to test how long the process would take me if we were to continue using the ZOBODAT database. As I said, though, I have no plans to attempt to completely correct the current taxonomy and am not directly involved in the ITIS list being created.

Pertaining to Vespidae taxonomic changes to expect in the future, the main one is that most revisions "remove subspecies" after examining specimens and either synonymize them (if they only differ in color pattern) or elevate them to become new species (if structural morphological differences are found). Within subfamily Vespinae, all of the former Vespa subspecies have already been revised; I'm unsure about additional genera such as Vespula and Dolichovespula, but believe the situation is similar. Likewise in Polistinae, most former Polistes subspecies have already been revised (although there are also several other polistine genera where a few subspecies remain). There are also at least two more obscure vespid subspecies, but the situation is similar for them as well. I think it's the major subfamily Eumeninae that currently retains the most subspecies (to be revised), and it will also need many changes affecting the genus or tribal membership of taxa. For example, as mentioned, which genera belong to which eumenine tribes is still being determined for a subset of them, and phylogenetic studies strongly suggest that the current members of tribe Zethini will eventually become members of a new subfamily, Zethinae.