Ghini / ghini.desktop

plant collections manager (desktop version)
http://ghini.github.io/
GNU General Public License v2.0

consider GBIF as (alternative) taxonomic source #458

Open mfrasca opened 5 years ago

mfrasca commented 5 years ago

@tmyersdn writes in #440:

I currently prefer GBIF, as I find it gives better information about synonyms; it is also more international in its governance, with its head office in Copenhagen. https://www.gbif.org/species/search?q=sterculiaceae https://en.wikipedia.org/wiki/Global_Biodiversity_Information_Facility

I had a look and I think that the structure of the result is a lot more usable than results from EOL. http://api.gbif.org/v1/species/match?verbose=false&name=Abies%20argentea

pity the result contains no intermediate taxonomic information between family and genus (see Vanda where I expect some reference to Subfamilia Epidendroideae, Tribus Vandeae, Subtribus Aeridinae or Sterculiaceae where I expect some reference to Sterculioideae) nor between genus and species (see Rhododendron farrerae: subgenus Azaleastrum, sectio Tsutsusi, subsectio Brachycalyx)

mfrasca commented 5 years ago

but this one contains no information about hybrids!

mfrasca commented 5 years ago

I think that hybrid information is relevant enough, missing it: no go.

thomasstjerne commented 5 years ago

Hello @mfrasca GBIF taxonomy includes hybrids, where this information is available from sources: http://api.gbif.org/v1/species/match?name=Pilosella%20officinarum%20x%20Pilosella%20piloselloides%20subsp.%20bauhinii

When using the species match API, you should make another call to the species API with the usageKey to get all information: http://api.gbif.org/v1/species/9645296
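The two-step lookup described above could be sketched roughly like this. It is only a sketch, not ghini code: the function names and the injectable `fetch_json` parameter are made up for illustration, and it assumes that a response without `usageKey` means nothing matched.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "http://api.gbif.org/v1/species"  # base URL as used in the links above


def http_get_json(url):
    """Fetch a URL and decode its JSON body (performs a real network call)."""
    with urlopen(url) as response:
        return json.load(response)


def lookup(name, fetch_json=http_get_json):
    """Match a name, then fetch the full record for the returned usageKey.

    Returns None when the match response carries no usageKey
    (assumption: that is how "no match" shows up).
    """
    match = fetch_json(API + "/match?" + urlencode({"name": name}))
    usage_key = match.get("usageKey")
    if usage_key is None:
        return None
    return fetch_json("{}/{}".format(API, usage_key))
```

Passing a stub as `fetch_json` lets the flow be tried without network access; calling `lookup("Abies argentea")` with the default performs the two HTTP requests.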

Cheers, Thomas Stjernegaard

mfrasca commented 5 years ago

Hi @thomasstjerne, thank you for the hybrid formula example! but what about nothotaxa? like Brassocattleya. compare gbif with tpl with wikipedia

MattBlissett commented 5 years ago

Hi @mfrasca,

Nothotaxa work where we have the data, but we usually follow the Catalogue of Life, which unfortunately says it's not a hybrid. You can see the three checklists in which the name (×) Brassocattleya arauji appears, including TPL and IPNI, which show it as a hybrid. The website runs from the public APIs, so it would be possible to use GBIF's species match API, retrieve the …/related names, then choose the TPL one where it exists (i.e. look for a name from dataset d9a4eedb-e985-4456-ad46-3df8472e00e8). GBIF's matching API only runs against the GBIF backbone, though others have requested that we allow matching to the checklists we index, and we plan to do this.
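Choosing the TPL record among the related names might look like the sketch below. It works on a list of records that has already been retrieved; the `datasetKey` field name is an assumption about the response format, and `pick_tpl` is a hypothetical helper.

```python
# Dataset key for TPL, as quoted in the comment above.
TPL_DATASET_KEY = "d9a4eedb-e985-4456-ad46-3df8472e00e8"


def pick_tpl(related, fallback=None):
    """Return the first related record published by the TPL dataset.

    `related` is a list of dicts, one per related name usage.
    The `datasetKey` field name is assumed, not confirmed here.
    """
    for usage in related:
        if usage.get("datasetKey") == TPL_DATASET_KEY:
            return usage
    return fallback
```

Returning `fallback` when TPL has no entry lets the caller fall back to the backbone record from the match call.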

Also, I think the export of TPL we have could be improved. It doesn't have genus or family names, and I don't think most hybrid names are formatted correctly.

×Brassocattleya fregoniana we have as a hybrid, since it's not in the Catalogue of Life. (nameType=HYBRID is only for hybrid formulae; for nothotaxa, I think you just need to look for the ×.)

We don't have any intermediate ranks.

If testing the matching API, provide kingdom=Plantae to avoid possible homonyms in other kingdoms.

You had one other comment in your email, which I hope you don't mind me quoting here:

I don't find the example, and I might be confused with a different source, but I thought you sometimes provide year of publication, and again not as a separate field.

I think you've probably read our API documentation which links to some zoological examples. The year is part of the standard format for a zoological name: Puma Jardine, 1834 — or else it was the publishedIn field on any name where we have it.

In case it's useful for assessing its suitability, you can download the GBIF backbone checklist as a Darwin Core archive (zipped TSVs) from https://doi.org/10.15468/39omei , and filter for kingdom=6 (Plantae).
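Filtering one of the unzipped TSV files for Plantae could be done along these lines. This is a sketch under assumptions: the default column name `kingdomKey` is a guess that has not been checked against the actual archive, and `plant_rows` is a hypothetical helper.

```python
import csv


def plant_rows(lines, column="kingdomKey", value="6"):
    """Yield rows (as dicts) whose `column` equals `value`.

    `lines` is any iterable of text lines from a tab-separated file with
    a header row, e.g. an open file. The default column name is a guess.
    """
    # QUOTE_NONE: treat quote characters inside names as ordinary text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get(column) == value:
            yield row
```

Because it is a generator, the whole checklist never has to be held in memory; pass an open file object and iterate over the result.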

We are working on better integration with the Catalogue of Life, and this is likely to include a new version of our checklist API. I don't think there are any new issues here, but I'll tag @mdoering (the developer for GBIF and CoL) anyway.

Thanks!

mfrasca commented 5 years ago

what a huge amount of information! thank you warmly!

tmyersdn commented 5 years ago

Hi @mfrasca -

TPL - something to note is that TPL is not updated very often (current Version 1.1 in September 2013 - see the details on the home page).

mdoering commented 5 years ago

Hi @mfrasca, we do dedicate a property notho to Names in both GBIF and the upcoming CoL+ Clearinghouse. But I can see it is well hidden in GBIF: we do not expose it in the "NameUsage" aka taxon/species, just in its "Name": https://api.gbif.org/v1/species/3651890/name https://api.gbif.org/v1/species/3651890
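Reading that property from the "Name" response might be sketched as follows. The helper names are hypothetical, and the sketch assumes the property is simply absent (or empty) when the name is not a nothotaxon.

```python
def name_url(usage_key):
    """Build the URL of the "Name" record, following the link pattern above."""
    return "https://api.gbif.org/v1/species/{}/name".format(usage_key)


def notho_part(parsed_name):
    """Return the value of the `notho` property, or None when it is not set.

    `parsed_name` is the decoded JSON of the "Name" response. Which values
    the property can take is not covered here.
    """
    return parsed_name.get("notho") or None
```

A caller would fetch `name_url(3651890)`, decode the JSON, and treat any non-None result from `notho_part` as a hybrid marker.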

mfrasca commented 5 years ago

hallo @mdoering and @MattBlissett! still on nothotaxa: some time ago I merged genus information from ars-grin with the default database from Bauble. I collected 26924 genus epithets, of which 547 are marked as nothotaxa. I just checked my merged collection of genus epithets against ipni through kew's 'reconciliation'. 'reconciliation' gives a reply on 25702 epithets (of 26924) and reports 121 nothotaxa, none of which is in my initial set of 547 (so I need to review). in particular, just to spotlight something which is obviously incorrect, only 40 genera in the Orchidaceae family have a hybrid marker in your database, while I have 472 marked thus. you can check my list in bauble/plugins/plants/default/genus.txt by running grep '"x .*345' genus.txt there (345 is the id of family Orchidaceae).
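For anyone without grep at hand, the same check could be done in Python roughly as below. It reuses the grep pattern unchanged; `matching_lines` is a hypothetical helper and nothing is assumed about the layout of genus.txt beyond what the pattern itself implies.

```python
import re

# Same expression as the grep command: a quoted name starting with "x ",
# followed anywhere later on the line by 345.
ORCHID_HYBRID = re.compile(r'"x .*345')


def matching_lines(lines, pattern=ORCHID_HYBRID):
    """Return the lines that contain a match for `pattern`, like grep does."""
    return [line for line in lines if pattern.search(line)]
```

Like the grep command, this is a loose match: 345 is found anywhere after the name, so a line containing, say, 13450 would also be counted. With an open file, `len(matching_lines(f))` gives the count.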

mfrasca commented 5 years ago

(edited the above, I had made a couple of mistakes matching the two lists)