traitecoevo / taxonlookup

A versioned and dynamically updating taxonomic lookup table for land plants
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12517/abstract

Missing genera+species in Ericaceae, not found in plant_lookup.csv #30

Closed rossmounce closed 7 years ago

rossmounce commented 7 years ago

I have checked the 'bad genera' file and Acrothamnus isn't listed there, nor does it appear in plant_lookup.csv: https://github.com/traitecoevo/taxonlookup/blob/master/source_data/badGeneraFamilyPairs.csv

Acrothamnus does occur in TPLv1.1, although all of its species are unresolved: http://www.theplantlist.org/tpl1.1/search?q=Acrothamnus

According to Tropicos the genus was established recently: Published In: Australian Systematic Botany 18: 451. 2005. (Austral. Syst. Bot.)

I may have a few more of these 'missing genera' to report later today...

rossmounce commented 7 years ago

In the exact same tribe, Styphelieae (family: Ericaceae), I see that Androstoma is also missing from plant_lookup.csv, whilst present on TPLv1.1: http://www.theplantlist.org/tpl1.1/search?q=Androstoma

rossmounce commented 7 years ago

Another in the Ericaceae:

Leptecophylla http://www.theplantlist.org/tpl1.1/search?q=Leptecophylla

13 unresolved species names on TPLv1.1

I can't find anything obviously wrong/invalid about this genus. The only oddity is it isn't in Tropicos. See also APNI: https://biodiversity.org.au/nsl/services/apni?name=LEPTECOPHYLLA&max=100&display=apni&search=true

wcornwell commented 7 years ago

Cool thanks for these! All our processing comes from this file: http://www.theplantlist.org/1.1/browse/A/Ericaceae/Ericaceae.csv

For some reason these genera aren't in that CSV, so they aren't ending up in the lookup.

I can't find anything on TPL that associates the genus with the Ericaceae.

That said, other sources do suggest they should be there, e.g. https://en.wikipedia.org/wiki/Leptecophylla

It would be nice to include an alternative genus-family mapping that helps us get at the TPL 1.1 errors/omissions. Kew has been promising that a replacement/update for TPL is coming, but I haven't seen anything yet. Any ideas on how to handle this?

wcornwell commented 7 years ago

Trying to think how to generalize the problem. How are you finding these missing species?

rossmounce commented 7 years ago

I have a big list of 100k+ species that I'm contractually bound not to share. Sorry. I found them by cleaning those names first with Taxonstand (which checks against TPL, so I positively know they exist there as either accepted or unresolved), then with taxonlookup.

What I am at liberty to do however is provide the full list of 81 genus names I encountered (not necessarily 100% of the genera that are affected), that might suffer this problem. So the methodology boils down to: recognised as TPL-present by Taxonstand, but not present in taxonlookup's plant_lookup.csv.
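That comparison is essentially a set difference. A minimal Python sketch, assuming the Taxonstand-cleaned names are available as plain binomials and that the lookup table has a `genus` column (the two-row table below is a stand-in, not the real plant_lookup.csv, and the function names are made up for illustration):

```python
import csv
import io

# Stand-in for plant_lookup.csv; only a `genus` column is assumed here.
LOOKUP_CSV = """genus,family
Primula,Primulaceae
Gibbaeum,Aizoaceae
"""


def genus_of(name):
    """Return the first word of a binomial, ignoring a hybrid sign (×)."""
    return name.replace("×", " ").split()[0]


def missing_genera(tpl_names, lookup_csv_text):
    """Genera that occur in the TPL-verified names but not in the lookup table."""
    rows = csv.DictReader(io.StringIO(lookup_csv_text))
    known = {row["genus"] for row in rows}
    seen = {genus_of(name) for name in tpl_names}
    return sorted(seen - known)


# Names taken from this thread; a real run would use the Taxonstand output.
names = [
    "Sredinskya grandis",
    "Primula grandis",
    "Imitaria muirii",
    "Gibbaeum nebrownii",
    "× Triticale",
]
print(missing_genera(names, LOOKUP_CSV))
```

Note that a check like this cannot tell hybrid genera, section names, or synonymised genera apart from genuinely missing ones; those still need the manual triage described below.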

Some are clearly hybrid genera (nothogenera), so it's understandable that they aren't included, and I don't expect them to be added to plant_lookup.csv, e.g. × Sorbocotoneaster, × Gasteraloe, × Elyhordeum, × Triticale.

Others appear to be section names rather than proper genera, e.g.
Sympagis (Acanthaceae) http://www.theplantlist.org/tpl1.1/search?q=Sympagis
Trichera (Caprifoliaceae, Dipsacoideae) http://www.theplantlist.org/tpl1.1/search?q=Trichera

Others I didn't report as an issue because the type and/or only species of the genus has been synonymised into another genus, e.g.

a) "Sredinskya grandis (Trautv.) Fed. is an unresolved name. This name is unresolved, but some data suggest that it is synonymous with Primula grandis Trautv." http://www.theplantlist.org/tpl1.1/record/kew-2600656

b) "Imitaria muirii N.E.Br. is an unresolved name. This name is unresolved, but some data suggest that it is synonymous with Gibbaeum nebrownii Tischler." http://www.theplantlist.org/tpl1.1/record/kew-2862083

It would be of questionable value to include Sredinskya in plant_lookup.csv when most would consider it clearly not valid. However, there is the issue of consistency of treatment. If TPL says XYZ is "unresolved" there is an argument that this should be treated consistently, not with certain cases selectively suppressed/ignored.

Here's the full list for your perusal:

Acrothamnus
Aglossorhyncha
Amarcrinum
Anchusella
Androstoma
Aporophyllum
Apteranthes
Austroderia
Bolivicereus
Butyagrus
Calamphoreus
Calopsis
Camarotis
Chionodoxa
Chrysodracon
Cluytia
Cominsia
Coristospermum
Cryptocereus
Diatelia
Diocirea
Elyhordeum
Erythranthe
Eufragia
Festucopsis
Floribunda
Fuscospora
Gallium
Gasteraloe
Goeppertia
Hartmanthus
Heucherella
Hibanobambusa
Hormuzakia
Hypodematium
Ileostylus
Imitaria
Indoneesiella
Isotrema
Kandis
Karatas
Leptecophylla
Libonia
Lonchitis
Marcetella
Margyracaena
Marniera
Nanalettia
Nelumbium
Neophytum
Nevadensia
Omalanthus
Omphalolappula
Parabenzoin
Pentacoelium
Petagnia
Petesioides
Philippicereus
Phylliopsis
Piptocalyx
Rhaphidophyton
Robsonodendron
Roebuckiella
Sajanella
Schombolaelia
Sedopsis
Setiechinopsis
Somrania
Sorbocotoneaster
Sredinskya
Steris
Sympagis
Taihangia
Tetracarpaea
Thamnocalamnus
Trichera
Tricholoma
Triticale
Trixago
Trixspermum
Webbia

PS If you have any idea what Nanalettia focdiana might actually be (a horrific multi-error typo?), please let me know - I can't find any trace of this name, yet it seems faintly possible it might be a real synonym at least.

wcornwell commented 7 years ago

Makes sense, since taxonstand is querying tpl at the species level, and we're querying it at the family level, and there is apparently some set of species and genera that aren't inside any family.

No idea about Nanalettia focdiana. I have lots of spelling correction files and it's not in any, sorry. But there do seem to be some spelling mistakes on your list e.g. Gallium vs Galium and Thamnocalamnus should be Thamnocalamus.

As for a solution... I can't figure out any way to get TPL to tell me that Leptecophylla is in the Ericaceae, so it's hard to come up with a general solution. Maybe we need to start a list of "extra" genera to add? We just need four columns: (1) genus, (2) family, (3) number of accepted species, (4) number of unresolved species. All the examples you've found seem worth adding if we can.
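One way such an "extra genera" file could be folded into the lookup, sketched in Python. Everything here is hypothetical: the column names, the in-memory stand-in tables, and the rule that TPL-derived rows win on conflict. Counts are filled in only where this thread states them and are left empty otherwise.

```python
import csv
import io

# Hypothetical "extra genera" file with the four proposed columns.
# An empty count means "not known yet".
EXTRA_CSV = """genus,family,accepted_species,unresolved_species
Leptecophylla,Ericaceae,,13
Acrothamnus,Ericaceae,0,
"""


def parse_count(text):
    """Convert a count field to int, or None when the field is empty."""
    return int(text) if text.strip() else None


def read_extras(csv_text):
    """Read the extras file into a dict keyed by genus."""
    extras = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        extras[row["genus"]] = {
            "family": row["family"],
            "accepted_species": parse_count(row["accepted_species"]),
            "unresolved_species": parse_count(row["unresolved_species"]),
        }
    return extras


def add_extras(lookup, extras):
    """Return a copy of `lookup` with extra genera added; existing rows are kept."""
    merged = dict(lookup)
    for genus, record in extras.items():
        merged.setdefault(genus, record)
    return merged


# Stand-in for the TPL-derived table, keyed by genus.
lookup = {"Erica": {"family": "Ericaceae"}}
merged = add_extras(lookup, read_extras(EXTRA_CSV))
print(sorted(merged))
```

Keeping the extras in a separate file, rather than editing the generated table, would leave the TPL-derived data reproducible and make each manual addition easy to review.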

wcornwell commented 7 years ago

On Nanalettia focdiana, very liberal agrepping returns:

"Manettia_coccinea" "Manettia_hotteana" "Hypnobartlettia_fontana"

none of which are very convincing
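For anyone wanting to repeat this kind of fuzzy lookup outside R, here is a rough Python sketch using the standard-library `difflib`. It is not agrep: `difflib` scores similarity with its own sequence-matching ratio rather than an edit distance, so results will differ. The reference list and the 0.85 cutoff are assumptions chosen for illustration, using only genera mentioned in this thread.

```python
import difflib

# Reference genera mentioned in this thread; a real check would use the full genus list.
KNOWN_GENERA = ["Galium", "Thamnocalamus", "Manettia", "Hypnobartlettia"]


def suggest(name, known=KNOWN_GENERA, cutoff=0.85):
    """Return the closest known genus, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for query in ["Gallium", "Thamnocalamnus", "Nanalettia"]:
    print(query, "->", suggest(query))
```

With a strict cutoff the two misspellings noted earlier resolve cleanly, while Nanalettia gets no suggestion; lowering the cutoff makes the search more liberal, at the cost of unconvincing hits like the ones above.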

wcornwell commented 7 years ago

closed in favor of #37