globalbioticinteractions / globalbioticinteractions.github.io

source files for GloBI website
https://globalbioticinteractions.org
MIT License

wrong gaultheria species? #59

Closed naturalistcharlie closed 8 years ago

naturalistcharlie commented 8 years ago

I was just looking at my iNaturalist observations that were imported to your site (great idea btw, I am excited about it!) and found that the wrong Gaultheria species was tagged here:

http://www.globalbioticinteractions.org/#interactionType=interactsWith&accordingTo=http%3A%2F%2Fwww.inaturalist.org%2Fobservations%2F1054431

It should be Gaultheria procumbens and G. hispidula. C

jhpoelen commented 8 years ago

Hi C:

Thanks for pointing this out! GloBI cross-checks names against a number of different taxonomies. Also, some taxonomies suggest "preferred names" or "recommended names". (I know I am entering dangerous territory here, please bear with me ; ) ). One of these services uses a list provided by the National Biodiversity Network https://data.nbn.org.uk/Taxa . According to a version of this list, the preferred name of Gaultheria procumbens is . . . Gaultheria shallon (see attached screenshot). However, according to ITIS, for instance, http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=23657 the name is just fine.

One way to go about this is to contact the curator of https://data.nbn.org.uk/Taxa and suggest a correction. Another way is to change or re-order the name matching.

Would you be willing to contact NBN about this?

-jorrit

[image: screen shot 2015-10-13 at 8 02 29 pm]
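To make the "re-order the name matching" option concrete, here is a minimal Python sketch. It assumes that name matching consults a list of providers in order and stops at the first one that knows the name. The lookup tables and the `resolve_name` helper are hypothetical stand-ins, not GloBI's actual code or data structures. The NBN entry mirrors the lookup result described above, and the ITIS entry mirrors the observation that ITIS considers the name fine.

```python
# Hypothetical lookup tables: submitted name -> preferred name, per provider.
NBN = {"Gaultheria procumbens": "Gaultheria shallon"}  # lookup result described above
ITIS = {"Gaultheria procumbens": "Gaultheria procumbens"}  # name accepted as-is


def resolve_name(name, providers):
    """Return (provider_label, preferred_name) from the first provider that knows the name."""
    for label, table in providers:
        if name in table:
            return label, table[name]
    return None, name  # no provider knows the name; keep it unchanged


# Same name, same providers; only the order in which they are consulted differs.
nbn_first = resolve_name("Gaultheria procumbens", [("NBN", NBN), ("ITIS", ITIS)])
itis_first = resolve_name("Gaultheria procumbens", [("ITIS", ITIS), ("NBN", NBN)])
print(nbn_first)
print(itis_first)
```

Under these assumptions, the provider order alone decides which preferred name ends up attached to the observation, which is why re-ordering can hide a bad mapping without actually fixing it.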

jhpoelen commented 8 years ago

On second thought, it might make more sense to resolve iNaturalist's taxon ids to their name provider (in this case Catalogue of Life: http://www.catalogueoflife.org/annual-checklist/2010/details/species/id/7041371 from http://www.inaturalist.org/taxa/62376-Gaultheria-procumbens).

Curious to hear your thoughts on this.

naturalistcharlie commented 8 years ago

Aha! I wondered about a taxonomy issue. The iNat taxonomy system is a bit convoluted as it tries to accommodate everyone's needs. See http://www.inaturalist.org/pages/curator+guide#policies . Overall we try to use The Plant List for plants, but locally in New England, USA, when it doesn't create conflicts, we use Flora Novae Angliae.

In the case of this issue I think it is a mistake rather than a taxonomy question. Gaultheria procumbens is a small ground cover and G. shallon is a large shrub to tree. I can't find any reference that indicates these two extremely different species should be merged, so it may just be a typo or something. I'm not sure if you can isolate and remove the mistake or if we need to talk to that other group to get it fixed.


Charlie Hohn Montpelier, Vermont

jhpoelen commented 8 years ago

@kueda - is there any way to trace an inat taxon to its naming authority? E.g. taxon http://www.inaturalist.org/taxa/62376.json seems to point to http://www.catalogueoflife.org/annual-checklist/2010/details/species/id/7041371 on the taxon pages, but I can't seem to find this info in the json responses.

@naturalistcharlie I've sent an email to Chris Raper - Manager of the UK Species Inventory, Angela Marmont Centre for UK Biodiversity, The Natural History Museum. I've included a snippet of the email below. I am curious what he has to say about this.


[...] Also, I wanted to ask you for some advice regarding the NBN UKSI dump that you shared some time ago. According to UKSI, it seems that the recommended species name for Gaultheria procumbens is Gaultheria shallon, even though these appear to be very different species. For a discussion thread see https://github.com/globalbioticinteractions/globalbioticinteractions.github.io/issues/59 . My questions are: am I using the UKSI name list correctly? If so, do you agree with what “naturalistcharlie” is suggesting? How would we go about making a correction to the name list? If you feel comfortable with commenting on GitHub, please do, otherwise, I’ll relay the information. [...]

kueda commented 8 years ago

There is no way to see that information in the iNat API at present, but we could add it. What endpoints would you want to see that info for? Keep in mind that not all of our taxa will have that info, and that it is really strictly meant to track the original provenance of that taxon, not its current placement in iNat's taxonomy, so I'm not entirely sure how useful that would be to you in resolving conflicts between different taxonomies.

naturalistcharlie commented 8 years ago

Thanks Jorrit! Let me know what they say. If any group has truly merged those two species, I would be interested to know (and very confused as well). I'm not a taxonomist so it's possible I got confused somewhere along the way.


jhpoelen commented 8 years ago

@kueda GloBI is using the observation_fields endpoint. I think that having the external or authoritative taxon id would be useful if available for a particular observation/taxon. This would allow GloBI to link to the referred authority and extract the taxon hierarchy from it. I am not trying to resolve conflicts between taxonomies (hard!), but just making sure that the name is known somewhere other than the data source, and using the full hierarchy to allow searches like "what do birds eat?". I am hoping that projects like http://globalnames.org and http://eol.org can help establish links between taxon ids in these taxonomies.

kueda commented 8 years ago

Do you mean http://www.inaturalist.org/observation_fields.json? That would only get you info on the types of fields we have, not the observation field value data. I just wanted to confirm you were using http://www.inaturalist.org/observations.json to get the data. We could add something like http://www.inaturalist.org/observations.json?extras=sources or something to include some of that sourcing data.

jhpoelen commented 8 years ago

oops, I meant to say http://www.inaturalist.org/observation_field_values.json?type=taxon&quality_grade=research (omitting pagination). I'd say it would be super nice to have a taxon source id / source label for those that appear in these observation field values. http://www.inaturalist.org/observation_field_values.json?type=taxon&quality_grade=research&extras=sources would definitely work nicely.
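For readers who want to see the request spelled out, here is a small Python sketch that builds the URL discussed above, including the pagination that was omitted. The `type`, `quality_grade` and `extras=sources` parameters come from this thread (`extras=sources` is only a suggestion at this point and may not match what the API ends up supporting). The `page` and `per_page` parameter names and the `field_values_url` helper are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.inaturalist.org/observation_field_values.json"


def field_values_url(page=1, per_page=100):
    """Build a paged request URL for research-grade taxon field values."""
    params = {
        "type": "taxon",
        "quality_grade": "research",
        "extras": "sources",   # suggested in this thread; may not be what was shipped
        "page": page,          # assumed pagination parameter name
        "per_page": per_page,  # assumed pagination parameter name
    }
    return BASE_URL + "?" + urlencode(params)


url = field_values_url(page=2)
print(url)
```

A client would presumably request successive pages until an empty result comes back, but that stopping rule is also an assumption rather than something stated here.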

jhpoelen commented 8 years ago

Chris Raper of the National Biodiversity Network was kind enough to share his insights into the matter of the Gaultheria sp. (see below). As far as I can tell, GloBI didn't consider the author part of the species name, which caused Gaultheria procumbens to be interpreted as "Gaultheria procumbens auct., non L.", which was subsequently mapped to Gaultheria shallon. So, the root cause is a mapping error by GloBI.

One way to fix this bug is to consider the taxon id rather than the taxon name. This side-steps the (often ambiguous) taxon matching by name. Will keep you informed about the progress of resolving this issue. . .


Chris Raper, Pers. Comm. 16 Oct 2015:

[...] Here is the current situation with the 2 species:

Gaultheria procumbens L. (Species, flowering plant) NON-NATIVE TERRESTRIAL =Checkerberry

Gaultheria shallon Pursh (Species, flowering plant) NON-NATIVE TERRESTRIAL =Gaultheria procumbens auct., non L. =Shallon

They are 2 different taxa but it seems that G. shallon at one time was referred to by the name "G. procumbens" but not in the same sense as Linnaeus' G. procumbens, so we have the junior synonym "Gaultheria procumbens auct., non L." meaning that this name was used by "authors" and is not the same as Linnaeus's name. Confusing ... but there have been some terrible mistakes in the past and we have to somehow describe what went on :) [...]
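The root cause described above can be illustrated with a short Python sketch. This is not GloBI's actual matching code: the entries are paraphrased from the UKSI excerpt, and both the entry order and the "last entry with the same binomial wins" behavior are assumptions chosen to reproduce the symptom. The helper names are hypothetical.

```python
# (name with authorship, recommended name), paraphrased from the excerpt above.
NAME_LIST = [
    ("Gaultheria procumbens L.", "Gaultheria procumbens"),
    ("Gaultheria shallon Pursh", "Gaultheria shallon"),
    ("Gaultheria procumbens auct., non L.", "Gaultheria shallon"),
]


def canonical(name_with_author):
    """Keep only genus and epithet, discarding the authorship string."""
    return " ".join(name_with_author.split()[:2])


def match_ignoring_author(name):
    """Lossy matcher (assumed behavior): entries sharing a binomial overwrite each other."""
    hits = {}
    for full_name, recommended in NAME_LIST:
        hits[canonical(full_name)] = recommended
    return hits.get(name)


def match_with_author(name_with_author):
    """Match on the full string, so 'L.' and 'auct., non L.' stay distinct."""
    return dict(NAME_LIST).get(name_with_author)


lossy = match_ignoring_author("Gaultheria procumbens")
exact = match_with_author("Gaultheria procumbens L.")
print(lossy)
print(exact)
```

Once the authorship is dropped, the misapplied name and Linnaeus's name collapse into the same key, so the lookup can land on Gaultheria shallon even though the list itself keeps the two taxa apart.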

kueda commented 8 years ago

Ok, http://www.inaturalist.org/observation_field_values.json?type=taxon should now include both the source of the taxon (where we got it from) and the taxon schemes, which are representations of what we think is the same taxon concept in other databases. As I said before, you should take all of these with a grain of salt.

jhpoelen commented 8 years ago

@kueda thanks for adding this . . . I am hoping to integrate the source taxonomies into the GloBI ingestion process.

@naturalistcharlie looks like we have a way to correct the taxon linking error that you reported. Hopefully, this will also prevent similar errors from happening in the future. I'll let you know once the fix is in and the data is processed. Thanks for being patient . . .

jhpoelen commented 8 years ago

@kueda just implemented first pass at using taxon scheme ids to resolve names. For now, am only using the GBIF names.

@naturalistcharlie thanks for being patient. . . I have a first pass at implementing the fix using a taxon id instead of a name. This will hopefully prevent the mismatches we've seen. It'll take a while for the changes to propagate into the data. I'll close the issue once it no longer occurs in the data.
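As a rough illustration of "using a taxon id instead of a name", here is a minimal Python sketch. The record shape, the `taxon_schemes` key and the `taxon_key` helper are hypothetical (the real iNaturalist JSON field names may differ), and the id value is a placeholder rather than a real GBIF key.

```python
def taxon_key(record, preferred_scheme="GBIF"):
    """Prefer an external scheme id; fall back to the plain name if none is present."""
    scheme_id = record.get("taxon_schemes", {}).get(preferred_scheme)
    if scheme_id:
        return "id", f"{preferred_scheme}:{scheme_id}"
    return "name", record["name"]


# Hypothetical records; "placeholder-1" stands in for a real GBIF identifier.
with_id = {"name": "Gaultheria procumbens", "taxon_schemes": {"GBIF": "placeholder-1"}}
without_id = {"name": "Gaultheria procumbens"}

print(taxon_key(with_id))
print(taxon_key(without_id))
```

The fallback branch matters because, as noted earlier in this thread, not every taxon carries this information, so name matching cannot be dropped entirely.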

jhpoelen commented 8 years ago

Expected Gaultheria species are now appearing in GloBI. Instead of name matching, GloBI is now using taxon ids (GBIF ids in case of iNaturalist). Thanks all for reporting and discussing this.

[image: screen shot 2015-11-15 at 9 42 38 am]

naturalistcharlie commented 8 years ago

I might have found another one, though I am not certain.

http://www.globalbioticinteractions.org/#interactionType=interactsWith&accordingTo=http%3A%2F%2Fwww.inaturalist.org%2Fobservations%2F1111117

The observed species was B. salicifolia. B. salicina seems to be the same as B. emoryi but not B. salicifolia. The mix-up may be on EOL, though.

http://www.calflora.org/cgi-bin/species_query.cgi?where-calrecnum=11372

That said, I don't see any link between this species and salicifolia on EOL http://eol.org/pages/469342/names


jhpoelen commented 8 years ago

Hi @naturalistcharlie - thanks for your attention to detail! From what I can tell, the species names Baccharis salicina and Baccharis salicifolia appear to be synonyms according to ITIS and GBIF, as shown on the EOL page http://eol.org/pages/469342/names/synonyms . Also, it seems that the name Baccharis salicina is the preferred name for the species. Since GloBI picks the preferred name, the behavior that you reported (see screenshot below) is expected. If you feel that there's still something funny going on, please open a new issue.

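[image: screen shot 2015-11-16 at 8 02 40 am] https://cloud.githubusercontent.com/assets/1084872/11186827/756e1038-8c38-11e5-800d-8af8aad4c46e.png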

naturalistcharlie commented 8 years ago

Hmm... I see that EOL does say that. However, I think it's an error (and not one you can change). Salicifolia and Salicina are definitely distinct AFAIK. Anyway, thanks... I won't point out each one I find, taxonomy is a challenge.


jhpoelen commented 8 years ago

@naturalistcharlie Thanks for the quick reply. I think that one of the cool side effects of projects like iNaturalist and GloBI is increased integration and usage of published taxonomies. I think that your comments might benefit those who maintain the taxonomies. This is why I am hoping that you can strike up a conversation with folks at ITIS / GBIF to discuss this further. I am very curious what will come out of this and how I can help. Also, please keep sharing your comments on possible data errors: I think your comments are valuable.

naturalistcharlie commented 8 years ago

I made a comment on the EOL page for the species noting that this seems to be an error. I didn't see an option to flag the observation or otherwise contact people. I am not really familiar with that site.

C


jhpoelen commented 8 years ago

@jhammock @KatjaSchulz - what is the procedure for reporting a possible error in name providers that are included in EOL (see discussion related to this issue)?

KatjaSchulz commented 8 years ago

@naturalistcharlie @jhpoelen Leaving a comment on the affected taxon page is generally the best approach. Note, however, that this is not a clear-cut error. Baccharis salicifolia is listed as a synonym of Baccharis salicina in the Global Compositae Checklist http://compositae.landcareresearch.co.nz/?Page=NameDetails&NameId=645BA8CD-8218-4541-AEB9-733C7B536C2C, which is the authority used by The Plant List, Catalogue of Life, and EOL & GBIF via Catalogue of Life. So I guess it's a question of who you want to follow here, Jepson or GCC. Also, note that Jepson does not actually have a listing for Baccharis salicifolia, it only has the subspecies Baccharis salicifolia subsp. salicifolia, so there's something funny going on with their taxonomy in this case.

naturalistcharlie commented 8 years ago

I'm usually a lumper but these species seem pretty different to me. Alas... what can you do?

I believe the reason Jepson only has the subspecies is because that is the only one in California. I don't really like that approach either... but again what can you do?

If it's an open taxonomy question rather than an error... I suppose there are no changes to be made, once the reference is set. Of course, iNaturalist is, I believe, in part pinned to Jepson, so that adds another tangle. But thanks much for your response and consideration!