Closed Adafede closed 2 years ago
hi @Adafede, thanks for submitting it
Yes, you are right. I just released a new version of gnames in which I tried to fix a few inconsistencies in the API, and as a result v0.8.0 has quite a few breaking changes. Sorry about the extra work for you, but I think in the long run these changes will be justified. Please read https://github.com/gnames/gnames/releases/tag/v0.8.0
The purpose of most of these changes was to make search and verification behave almost identically. Before, the "data-sources" option and the "show all matches" option behaved differently, and it was very confusing. In short:
-s 0 option does not exist at all anymore; it is the default behavior now.
-s N,N2,N3 option now limits search to the provided data-sources.
-M option now shows all results.
BestMatch in csv/tsv format indicates the best result for a name. SortedMatch shows all other results and, as you see, they are sorted; the new field 'SortScore' shows the numbers that were used for sorting.
Again, apologies for the breaking changes. There are quite a few of them because I tried to put all of them into one release, so that I do not break others' scripts too often. I hope it is now close to stabilizing, but until /api/v0 becomes /api/v2 I will listen to suggestions on improving the API.
At the start of the next week I will make a blog post that will explain the changes I made.
No problems, I know it is for the good! 😉
My question is regarding -s 1,2,3. To me, it should work without -M, but it does not, am I right?
➜ lotus-processor git:(main) ✗ bin/gnverifier "Iris pallida" -s 1,2,3,4,179
00:40:14 INF Using config file: /Users/rutza/Library/Application Support/gnverifier.yaml.
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,9.01315,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,3PZY6,Iris pallida Lam.,false,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris pallida,
can you show the version of gnverifier?
With v0.8.2 I get:
✦ ❯ gnverifier "Iris pallida" -s 1,2,3,4,179
18:09:18 INF Using config file: /home/dimus/.config/gnverifier.yaml.
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,9.01315,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,3PZY6,Iris pallida Lam.,false,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris pallida,
It shows the best result only, and with -M I get all of them:
✦ ❯ gnverifier "Iris pallida" -s 1,2,3,4,179 -M -q
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,9.01315,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,3PZY6,Iris pallida Lam.,false,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris pallida,
SortedMatch,9.01005,Exact,0,Iris pallida,"Iris pallida Salisb., nom. illeg.",Iris pallida,3PZY5,Iris halophila Pall.,true,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris halophila,
SortedMatch,9.01005,Exact,0,Iris pallida,"Iris pallida Ten., nom. illeg.",Iris pallida,3PZY4,Iris germanica L.,true,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris germanica,
SortedMatch,8.98392,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,43223,Iris pallida Lam.,false,3,ITIS,Plantae|Viridiplantae|Streptophyta|Embryophyta|Tracheophyta|Spermatophytina|Magnoliopsida|Lilianae|Asparagales|Iridaceae|Iris|Iris pallida,
SortedMatch,8.98012,Exact,0,Iris pallida,Iris pallida,Iris pallida,49342,Iris pallida,false,2,Wikispecies,,
SortedMatch,8.94848,Exact,0,Iris pallida,Iris pallida,Iris pallida,259588,Iris pallida,false,179,Open Tree of Life,||Eukaryota|Archaeplastida|Chloroplastida|Streptophyta|Embryophyta|Tracheophyta|Euphyllophyta|Spermatophyta|Magnoliopsida|Mesangiospermae|Liliopsida|Petrosaviidae|Asparagales|Iridaceae|Iridoideae|Irideae|Iris|Iris pallida,
SortedMatch,8.91436,Exact,0,Iris pallida,Iris pallida,Iris pallida,29817,Iris pallida,false,4,NCBI,|Eukaryota|Viridiplantae|Streptophyta|Streptophytina|Embryophyta|Tracheophyta|Euphyllophyta|Spermatophyta|Magnoliopsida|Mesangiospermae|Liliopsida|Petrosaviidae|Asparagales|Iridaceae|Iridoideae|Irideae|Iris|Iris pallida,
Here:
➜ lotus-processor git:(main) ✗ bin/gnverifier -V
version: v0.8.2
build: 2022-02-25_22:48:34UTC
➜ lotus-processor git:(main) ✗ bin/gnverifier "Iris pallida" -s 1,2,3,4,179
01:11:35 INF Using config file: /Users/rutza/Library/Application Support/gnverifier.yaml.
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,9.01315,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,3PZY6,Iris pallida Lam.,false,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris pallida,
➜ lotus-processor git:(main) ✗ bin/gnverifier "Iris pallida" -s 1,2,3,4,179 -M
01:11:45 INF Using config file: /Users/rutza/Library/Application Support/gnverifier.yaml.
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,9.01315,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,3PZY6,Iris pallida Lam.,false,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris pallida,
SortedMatch,9.01005,Exact,0,Iris pallida,"Iris pallida Salisb., nom. illeg.",Iris pallida,3PZY5,Iris halophila Pall.,true,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris halophila,
SortedMatch,9.01005,Exact,0,Iris pallida,"Iris pallida Ten., nom. illeg.",Iris pallida,3PZY4,Iris germanica L.,true,1,Catalogue of Life,Biota|Plantae|Tracheophyta|Liliopsida|Asparagales|Iridaceae|Iris|Iris germanica,
SortedMatch,8.98392,Exact,0,Iris pallida,Iris pallida Lam.,Iris pallida,43223,Iris pallida Lam.,false,3,ITIS,Plantae|Viridiplantae|Streptophyta|Embryophyta|Tracheophyta|Spermatophytina|Magnoliopsida|Lilianae|Asparagales|Iridaceae|Iris|Iris pallida,
SortedMatch,8.98012,Exact,0,Iris pallida,Iris pallida,Iris pallida,49342,Iris pallida,false,2,Wikispecies,,
SortedMatch,8.94848,Exact,0,Iris pallida,Iris pallida,Iris pallida,259588,Iris pallida,false,179,Open Tree of Life,||Eukaryota|Archaeplastida|Chloroplastida|Streptophyta|Embryophyta|Tracheophyta|Euphyllophyta|Spermatophyta|Magnoliopsida|Mesangiospermae|Liliopsida|Petrosaviidae|Asparagales|Iridaceae|Iridoideae|Irideae|Iris|Iris pallida,
SortedMatch,8.91436,Exact,0,Iris pallida,Iris pallida,Iris pallida,29817,Iris pallida,false,4,NCBI,|Eukaryota|Viridiplantae|Streptophyta|Streptophytina|Embryophyta|Tracheophyta|Euphyllophyta|Spermatophyta|Magnoliopsida|Mesangiospermae|Liliopsida|Petrosaviidae|Asparagales|Iridaceae|Iridoideae|Irideae|Iris|Iris pallida,
I would expect the -M result without the -M, and with -M to actually obtain possible multiples per source.
Ah, yes, I understood now. It is the change in verification procedure. It works like this now:
no flags -- searches everything, returns the best result
-s flag -- searches only the data-sources that you want and returns only the best result; the result is limited to the provided data-sources
-M -- shows everything found
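To make the three modes concrete, here is a tiny Python model of the selection rules described above. It is only an illustration of the behavior as explained in this thread, not gnverifier's actual implementation; the function name `select` and the `(sort_score, data_source_id)` tuples are made up for the sketch, and the numbers are copied from the "Iris pallida" output earlier in the thread.

```python
def select(matches, sources=None, all_matches=False):
    """Illustrative model only (hypothetical helper, not gnverifier code).

    matches     -- list of (sort_score, data_source_id) tuples
    sources     -- set of data-source ids, models the -s flag (None = search everything)
    all_matches -- models the -M flag
    """
    # -s limits the search to the provided data-sources
    pool = [m for m in matches if sources is None or m[1] in sources]
    # results are ordered by SortScore, highest first
    ranked = sorted(pool, key=lambda m: m[0], reverse=True)
    # without -M only the best result is returned
    return ranked if all_matches else ranked[:1]


# SortScore / DataSourceId pairs taken from the output shown above
iris = [
    (9.01315, 1), (9.01005, 1), (9.01005, 1),
    (8.98392, 3), (8.98012, 2), (8.94848, 179), (8.91436, 4),
]

no_flags = select(iris)                                   # best overall
only_s = select(iris, sources={3, 4})                     # best within sources 3 and 4
s_and_m = select(iris, sources={1, 2, 3, 4, 179}, all_matches=True)  # everything found
```

Note that with -M a single source can contribute more than one row (source 1 appears three times here), so -s together with -M is not the same as "best result per source".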
Never mind, I read too quickly!
What I wanted was -s 1,2,3 -M. Thanks a lot!
An option to pick only the best result per data-source: I can add it if I get a request. Did I get a request from you? ;)
I did not add it, because it is easy to pick the first result for each source using a script. So I am waiting for a request from people to have this option again.
Hmmm... isn't that what -s 1,2,3 -M is doing?
Only if there is just one match per source. For example:
✦ ❯ gnverifier "Jsoetes longissimum" -s 158 -M -q
Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,DataSourceTitle,ClassificationPath,Error
BestMatch,8.79803,Fuzzy,1,Jsoetes longissimum,Isoetes longissimum Bory,Isoetes longissimum,144750512,Isoetes longissimum Bory,false,158,EUNIS,,
SortedMatch,7.96012,Fuzzy,3,Jsoetes longissimum,Isoetes longissima Bory,Isoetes longissima,144848706,Isoetes longissima Bory,false,158,EUNIS,,
returns several
I see. No problem to filter by best score in a small script, as you mentioned, so no request; your work is substantial enough already! 😄
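For anyone landing here later, a minimal sketch of such a filtering script in Python. It assumes the CSV output of gnverifier with -M has been captured as text and relies only on the column names visible in the header above (ScientificName, DataSourceId, SortScore); the function name `best_per_source` is hypothetical.

```python
import csv
import io


def best_per_source(csv_text):
    """Keep the highest-SortScore row per (ScientificName, DataSourceId).

    Hypothetical helper: expects gnverifier CSV output (with header) as a string.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["ScientificName"], row["DataSourceId"])
        score = float(row["SortScore"])
        # keep the row only if it beats what we have seen for this name/source
        if key not in best or score > float(best[key]["SortScore"]):
            best[key] = row
    return list(best.values())


# The two EUNIS rows from the example above
sample = (
    "Kind,SortScore,MatchType,EditDistance,ScientificName,MatchedName,"
    "MatchedCanonical,TaxonId,CurrentName,Synonym,DataSourceId,"
    "DataSourceTitle,ClassificationPath,Error\n"
    "BestMatch,8.79803,Fuzzy,1,Jsoetes longissimum,Isoetes longissimum Bory,"
    "Isoetes longissimum,144750512,Isoetes longissimum Bory,false,158,EUNIS,,\n"
    "SortedMatch,7.96012,Fuzzy,3,Jsoetes longissimum,Isoetes longissima Bory,"
    "Isoetes longissima,144848706,Isoetes longissima Bory,false,158,EUNIS,,\n"
)

rows = best_per_source(sample)
for r in rows:
    print(r["DataSourceId"], r["MatchedName"], r["SortScore"])
```

The csv module is used rather than splitting on commas because some MatchedName values are quoted and contain commas (for example "Iris pallida Salisb., nom. illeg.").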
Hmmm... last annoying question: how can I obtain the equivalent of gnverifier "Iris pallida" -s 1,2,3,4,179 -M -q using gnfinder?
I think there is no -M in gnfinder
Fine, no worries. In the meantime I'll do it "old school", running gnverifier on top of gnfinder as I did a long time ago :)
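A rough sketch of that "old school" second step in Python, assuming the names have already been extracted (by gnfinder or any other way) into a plain list. Only the gnverifier flags shown in this thread (-s, -M, -q) are used; the helper names `verifier_command` and `verify` are hypothetical, and running `verify` requires gnverifier to be on the PATH.

```python
import subprocess


def verifier_command(name, sources):
    """Build the argv for one name, mirroring the invocation used in this thread."""
    source_arg = ",".join(str(s) for s in sources)
    return ["gnverifier", name, "-s", source_arg, "-M", "-q"]


def verify(names, sources):
    """Run gnverifier once per name and return the raw CSV text per name.

    Assumption: `names` is a list of name strings obtained beforehand.
    """
    results = {}
    for name in names:
        proc = subprocess.run(
            verifier_command(name, sources),
            capture_output=True,
            text=True,
            check=True,  # raise if gnverifier exits with an error
        )
        results[name] = proc.stdout
    return results


cmd = verifier_command("Iris pallida", [1, 2, 3, 4, 179])
```

Calling `verify(["Iris pallida"], [1, 2, 3, 4, 179])` would then return the same CSV text as the command line above, one entry per name.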
Hi @dimus !
Following your last updates, I am quite confused. I am facing strange behaviors I can't explain...
Here are some trials to diagnose:
I would say:
0 option does not work
-M helps (but I understood it should be multiple per source and not in total)
Sorry I can't help more... happy to further test/develop