Closed: nleguillarme closed this issue 3 years ago
@nleguillarme great idea! Would you happen to know how to access all the data associated with Index Fungorum?
Please note that Global Names already has some support for resolving Index Fungorum names, but unfortunately this is not yet available for offline processing @dimus.
Hi @jhpoelen, Index Fungorum seems to have an API for resolving both names and IDs: http://www.indexfungorum.org/ixfwebservice/fungus.asmx
@nleguillarme @jhpoelen I write to Paul Kirk periodically; he sends me somewhat idiosyncratic data, which I convert and import into https://verifier.globalnames.org. The data can be found in the data dump http://opendata.globalnames.org/dumps/gnames-2020-11-28.tar.gz
I am going to update his data in the next month or two.
The location of the latest data received from Paul Kirk can always be found in this file: https://github.com/GlobalNamesArchitecture/dwca_hunter/blob/master/lib/dwca_hunter/resources/index-fungorum.rb
Currently the data is at https://uofi.box.com/shared/static/54l3b7h4q4pwqq4fgqvx42h3d328fl1c.csv
@nleguillarme @dimus thanks for the info and ideas.
I made a first pass at offline-enabled Index Fungorum support and got results like:
$ echo "IF:177054" | nomer append indexfungorum
using matcher [indexfungorum]
[INDEX_FUNGORUM] taxonomy importing...
caching [https://uofi.box.com/shared/static/54l3b7h4q4pwqq4fgqvx42h3d328fl1c.csv] at [/media/jorrit/branta/nomer/1967320cc97d53ea9343a0611907accbb27344f4f4975d050d7aa7ea4486b80e.gz]...
Cookie rejected: "$Version=0; box_visitor_id=61805741abb017.31353894; $Path=/; $Domain=.box.com". Domain attribute ".box.com" violates RFC 2109: host minus domain may not contain any dots
Cookie rejected: "$Version=0; site_preference=desktop; $Path=/; $Domain=.box.com". Domain attribute ".box.com" violates RFC 2109: host minus domain may not contain any dots
Cookie rejected: "$Version=0; b=e8aa55fa47b0dde6f0f9bc54c0af9fb97375616c53c8a614f57524e29798f890; $Path=/; $Domain=.public.boxcloud.com". Illegal domain attribute ".public.boxcloud.com". Domain of origin: "public.boxcloud.com"
caching [https://uofi.box.com/shared/static/54l3b7h4q4pwqq4fgqvx42h3d328fl1c.csv] at [/media/jorrit/branta/nomer/1967320cc97d53ea9343a0611907accbb27344f4f4975d050d7aa7ea4486b80e.gz] done.
using cached [https://uofi.box.com/shared/static/54l3b7h4q4pwqq4fgqvx42h3d328fl1c.csv] at [/media/jorrit/branta/nomer/1967320cc97d53ea9343a0611907accbb27344f4f4975d050d7aa7ea4486b80e.gz]
cache with [547875] items built in [423.3] s or [1294.3] items/s.
[INDEX_FUNGORUM] taxonomy imported.
IF:177054 SYNONYM_OF IF:808518 Leucocybe candicans Fungi | Basidiomycota | Agaricomycotina | Agaricomycetes | Agaricomycetidae | Agaricales | Incertae sedis kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=808518
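A row like the one above can be post-processed with a few lines of code. The following is only a minimal sketch, assuming that nomer writes tab-separated rows and that the first five columns are provided ID, provided name, relation, resolved ID and resolved name. That layout is inferred from the output in this thread (the tabs are not visible above), not taken from nomer documentation. `Match` and `parse_row` are hypothetical names made up for illustration.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """First five columns of one output row (assumed layout)."""
    provided_id: str
    provided_name: str
    relation: str
    resolved_id: str
    resolved_name: str


def parse_row(line: str) -> Match:
    """Split one tab-separated row and keep its first five columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 5:
        raise ValueError(
            f"expected at least 5 tab-separated columns, got {len(columns)}"
        )
    return Match(*columns[:5])


# Sample row rebuilt from the output above. The provided name is empty
# because only an ID was piped into the matcher.
row = "IF:177054\t\tSYNONYM_OF\tIF:808518\tLeucocybe candicans"
match = parse_row(row)
print(match.resolved_id, match.resolved_name)
```

Only the leading columns are parsed here because the remaining ones (path, ranks, URL) are harder to delimit reliably from the transcript alone.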
Note that you can also list all the Index Fungorum names using:
$ nomer dump indexfungorum
using matcher [indexfungorum]
[INDEX_FUNGORUM] taxonomy already indexed at [/media/jorrit/branta/nomer/index_fungorum/index_fungorum], no need to import.
IF:1 Michenera SYNONYM_OF IF:17976 Licrostroma Fungi | Basidiomycota | Agaricomycotina | Agaricomycetes | Incertae sedis | Russulales | Peniophoraceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=17976
IF:2 Abaphospora SYNONYM_OF IF:3016 Massarina Fungi | Ascomycota | Pezizomycotina | Dothideomycetes | Pleosporomycetidae | Pleosporales | Massarinaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=3016
IF:3 Abrothallomyces SYNONYM_OF IF:4 Abrothallus Fungi | Ascomycota | Pezizomycotina | Dothideomycetes | Incertae sedis | Abrothallales | Abrothallaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=4
IF:4 Abrothallus SYNONYM_OF IF:4 Abrothallus Fungi | Ascomycota | Pezizomycotina | Dothideomycetes | Incertae sedis | Abrothallales | Abrothallaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=4
IF:5 Absconditella SYNONYM_OF IF:5 Absconditella Fungi | Ascomycota | Pezizomycotina | Lecanoromycetes | Ostropomycetidae | Ostropales | Stictidaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=5
IF:6 Abyssomyces SYNONYM_OF IF:6 Abyssomyces Fungi | Ascomycota | Pezizomycotina | Sordariomycetes | Incertae sedis | Incertae sedis | Incertae sedis kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=6
IF:7 Acallomyces SYNONYM_OF IF:7 Acallomyces Fungi | Ascomycota | Pezizomycotina | Laboulbeniomycetes | Laboulbeniomycetidae | Laboulbeniales | Laboulbeniaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=7
IF:8 Acantharia SYNONYM_OF IF:8 Acantharia Fungi | Ascomycota | Pezizomycotina | Dothideomycetes | Pleosporomycetidae | Venturiales | Venturiaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=8
IF:9 Acanthographina SYNONYM_OF IF:24 Acanthothecis Fungi | Ascomycota | Pezizomycotina | Lecanoromycetes | Ostropomycetidae | Ostropales | Graphidaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=24
IF:10 Acanthographis SYNONYM_OF IF:24 Acanthothecis Fungi | Ascomycota | Pezizomycotina | Lecanoromycetes | Ostropomycetidae | Ostropales | Graphidaceae kingdom | phylum | subphylum | class | subclass | order | family http://www.indexfungorum.org/names/NamesRecord.asp?RecordID=24
...
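In the listing above, some names resolve to themselves (for example IF:4 to IF:4) even though the relation is still SYNONYM_OF, while others point to a different ID (IF:3 to IF:4). A rough sketch of how the two cases could be told apart is given below. It assumes tab-separated rows with the provided ID in the first column and the resolved ID in the fourth, and it guesses that self-mapped rows are the currently accepted names; neither assumption is confirmed in this thread. `split_ids` and `partition_rows` are hypothetical helper names.

```python
from typing import Iterable, List, Tuple


def split_ids(line: str) -> Tuple[str, str]:
    """Return (provided ID, resolved ID) from one tab-separated row."""
    columns = line.rstrip("\n").split("\t")
    return columns[0], columns[3]


def partition_rows(rows: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate IDs that map to themselves from IDs that map elsewhere."""
    self_mapped: List[str] = []
    redirected: List[str] = []
    for row in rows:
        provided_id, resolved_id = split_ids(row)
        if provided_id == resolved_id:
            self_mapped.append(provided_id)
        else:
            redirected.append(provided_id)
    return self_mapped, redirected


# First five columns of three rows from the listing above, with the
# tabs restored by hand (assumed layout).
sample = [
    "IF:1\tMichenera\tSYNONYM_OF\tIF:17976\tLicrostroma",
    "IF:3\tAbrothallomyces\tSYNONYM_OF\tIF:4\tAbrothallus",
    "IF:4\tAbrothallus\tSYNONYM_OF\tIF:4\tAbrothallus",
]
self_mapped, redirected = partition_rows(sample)
print("self-mapped:", self_mapped)
print("redirected:", redirected)
```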
Feeding the ~0.5M Index Fungorum names back into the matcher itself
$ nomer dump indexfungorum | nomer append indexfungorum | pv -l > /dev/null
using matcher [indexfungorum]
using matcher [indexfungorum]
[INDEX_FUNGORUM] taxonomy already indexed at [/media/jorrit/branta/nomer/index_fungorum/index_fungorum], no need to import.
[INDEX_FUNGORUM] taxonomy already indexed at [/media/jorrit/branta/nomer/index_fungorum/index_fungorum], no need to import.
547k 0:01:16 [7.13k/s] [ <=> ]
took a little over 1 minute, without any need for internet connectivity.
Timing the two steps separately, with
$ time nomer dump indexfungorum | cut -f1,2 | pv -l | gzip > names.tsv.gz
using matcher [indexfungorum]
[INDEX_FUNGORUM] taxonomy already indexed at [/media/jorrit/branta/nomer/index_fungorum/index_fungorum], no need to import.
547k 0:00:24 [22.3k/s] [ <=> ]
real 0m24.579s
user 0m58.959s
sys 0m4.415s
and
$ time zcat names.tsv.gz | nomer append indexfungorum | pv -l > /dev/null
using matcher [indexfungorum]
[INDEX_FUNGORUM] taxonomy already indexed at [/media/jorrit/branta/nomer/index_fungorum/index_fungorum], no need to import.
547k 0:00:41 [13.1k/s] [ <=> ]
real 0m41.914s
user 1m19.796s
sys 0m7.136s
Note that this is all single-threaded and without any kind of optimization.
The Index Fungorum matcher is now available in Nomer v0.2.5: https://github.com/globalbioticinteractions/nomer/releases/tag/0.2.5
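As a quick sanity check, the rates reported in the logs can be recomputed from the item count and the elapsed times. This is just an arithmetic sketch: it assumes that the dump and append runs each processed the same 547875 rows reported when the cache was built (pv only shows a rounded 547k), and `items_per_second` is a hypothetical helper.

```python
def items_per_second(items: int, seconds: float) -> float:
    """Average throughput over a whole run."""
    return items / seconds


ITEMS = 547_875  # from "cache with [547875] items built in [423.3] s"

import_rate = items_per_second(ITEMS, 423.3)   # initial import
dump_rate = items_per_second(ITEMS, 24.579)    # "real 0m24.579s"
append_rate = items_per_second(ITEMS, 41.914)  # "real 0m41.914s"

print(f"import: {import_rate:.1f} items/s")
print(f"dump:   {dump_rate / 1000:.1f}k items/s")
print(f"append: {append_rate / 1000:.1f}k items/s")
```

Under that assumption the recomputed values line up with the 1294.3 items/s, 22.3k/s and 13.1k/s figures shown in the transcripts.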
@nleguillarme please review and confirm desired functionality by closing this issue.
@nleguillarme closing issue, please re-open if you find any issues.
Thank you @jhpoelen, it works like a charm.
@nleguillarme thanks for trying out the Index Fungorum matcher. Happy to hear any suggestions for improvements, or about funny things that come up as you use the newly added taxonomic scheme.
Thanks again to @dimus for helping to find access to an easy-to-use version of Index Fungorum.
Hi, it would be nice to add a matcher for the Index Fungorum taxonomy: http://www.indexfungorum.org/names/names.asp