Open javiermerino-tracasa opened 2 years ago
Hi @javiermerino-tracasa
You can limit the search to a given higher taxon, e.g. Plantae, using the TAXON_ID parameter.
The default match type is WHOLE_WORDS, which means that any name having rivas in the author string will match.
If you change the match type to EXACT you will only get matches that have your query as a full substring.
Using TAXON_ID=P and type=EXACT will find your species here:
https://api.catalogueoflife.org/dataset/3LR/nameusage/search?TAXON_ID=P&limit=50&offset=0&q=Morella%20rivas-martinezii&type=EXACT
You may however find that in some cases type=EXACT is too narrow when it comes to e.g. spelling variants and different author spellings.
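For automated use, the query string can be assembled in code rather than by hand. The following is a minimal Python sketch that rebuilds the URL above from its parameters. The helper name build_search_url and its defaults are assumptions made for illustration, not part of the API; only the parameters that appear in the URL above (TAXON_ID, limit, offset, q, type) are used.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the example URL in this thread.
BASE_URL = "https://api.catalogueoflife.org/dataset/3LR/nameusage/search"


def build_search_url(query, match_type=None, taxon_id=None, limit=50, offset=0):
    """Build a name usage search URL (hypothetical helper, for illustration).

    match_type and taxon_id are omitted from the URL when None, so the
    server-side defaults apply.
    """
    params = {}
    if taxon_id is not None:
        params["TAXON_ID"] = taxon_id
    params["limit"] = limit
    params["offset"] = offset
    params["q"] = query
    if match_type is not None:
        params["type"] = match_type
    # quote_via=quote encodes spaces as %20 instead of "+".
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_search_url("Morella rivas-martinezii", match_type="EXACT", taxon_id="P")
print(url)
```

Leaving match_type unset keeps the default WHOLE_WORDS behaviour, so the same helper covers both the strict and the loose search.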
Hello Thomas,
We are using the name search for getting information for hundreds of names in an automated fashion. We cannot rely on the taxon_id suggestion since we will not know the taxonomy for each name beforehand.
As for EXACT, we can try it first and, for the names where it finds nothing, fall back to WHOLE_WORDS. This will improve the results for names that match perfectly. However, the results will remain just as inaccurate for the examples I showed above.
Thanks. Javier
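The EXACT-first fallback described above could look roughly like this in Python. This is only a sketch: the fetch callable is hypothetical and is assumed to take a dict of query parameters, call the search endpoint, and return the list of hits. The stub below stands in for the real HTTP call and does not reflect actual API responses.

```python
def search_with_fallback(name, fetch, match_types=("EXACT", "WHOLE_WORDS")):
    """Try each match type in order and return (match_type, results)
    for the first one that yields any results, or (None, []) if none do.

    fetch is a caller-supplied function (an assumption of this sketch)
    that maps a parameter dict to a list of search hits.
    """
    for match_type in match_types:
        results = fetch({"q": name, "type": match_type})
        if results:
            return match_type, results
    return None, []


def stub_fetch(params):
    # Stand-in for a real request: one name matches exactly,
    # everything else only matches under WHOLE_WORDS.
    if params["type"] == "EXACT":
        if params["q"] == "Morella rivas-martinezii":
            return [{"name": "Morella rivas-martinezii"}]
        return []
    return [{"name": "looser match for " + params["q"]}]


used_type, hits = search_with_fallback("Morella rivas-martinezii", stub_fetch)
print(used_type, len(hits))
```

Returning the match type alongside the hits lets a batch job record which names only resolved through the looser search, so those can be reviewed separately.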
Alternatively you can also use the PREFIX type search when you want the name to start with your query string. In your example the query string Morella rivas-martinezii will be taken as 3 tokens: Morella, rivas & martinezii, which will match the name or authorship. You can also restrict the matching to the name alone, avoiding matches to the authorship, which in your example hits Rivas almost all the time:
Hello Markus,
Thanks for the help. I am now using type EXACT first and, when it fails, content scientific name, and it is working much better for us.
Hello, As part of our update to the EUNIS2 database, we are getting some species information from CoL. We get the taxonomy by searching by name. Around 80% of the time the search returns accurate results, but for the remaining 20% we get seemingly random, erroneous results. Below is an example of a plant name that returns a bacterium as the result. <https://api.catalogue.life/dataset/3LR/nameusage/search?q=Morella rivas-martinezii&limit=300> The actual taxonomy for "Morella rivas-martinezii" does not appear until search result number 227.
Here are two more examples:
Centranthus amazonum -> kingdom:Animalia, phylum:Chordata, class:Amphibia, order:Anura, family:Bufonidae, genus:Bufotes, species:Bufotes boulengeri
Pastor roseus -> kingdom:Animalia, phylum:Nematoda, class:Chromadorea, subclass:Chromadoria, order:Chromadorida, suborder:Chromadorina, superfamily:Chromadoroidea, family:Chromadoridae, subfamily:Euchromadorinae, genus:Crestanema
Also, when the name ends in "all others", for instance "Periparus ater all others", the search always returns the same result: kingdom:Plantae, phylum:Tracheophyta, class:Magnoliopsida, order:Caryophyllales, family:Amaranthaceae, subfamily:Chenopodioideae, genus:Bassia
Is this a bug in the search API or do I need some additional filters? Thanks a lot in advance. Javier