SpeciesFileGroup / taxonworks

Workbench for biodiversity informatics.
http://taxonworks.org

Task - Global Names Finder - exact match #2969

Closed proceps closed 1 year ago

proceps commented 2 years ago

I would like to have an option to filter names by whether or not they have an exact match to taxon_name.cached and taxon_name.cached_original.

mjy commented 2 years ago

Are you wanting to differentiate in the various sections as to how the name was found?

proceps commented 2 years ago

I would like to get a list of taxon names in a publication: all names, names matching those already in the database, and names found in the publication but not in TW.
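The three lists requested here can be sketched as a simple set-membership split. This is only an illustration, not TaxonWorks code: `partition_names` is a hypothetical helper, the two database columns are represented as plain Python lists, and "Aus bus" style names are placeholders rather than real data.

```python
def partition_names(found_names, cached, cached_original):
    """Split names found in a publication into those with an exact match
    in the database and those without one.

    Assumption: `cached` and `cached_original` are plain lists of strings
    standing in for taxon_name.cached and taxon_name.cached_original.
    """
    known = set(cached) | set(cached_original)
    matched, unmatched = [], []
    for name in found_names:
        # Exact string comparison only; no fuzzy matching is attempted.
        (matched if name in known else unmatched).append(name)
    return matched, unmatched


# Placeholder names, not real data.
found = ["Aus bus", "Aus cus", "Aus dus"]
matched, unmatched = partition_names(
    found,
    cached=["Aus bus"],
    cached_original=["Aus cus"],
)
```

Here `found` is the "all names" list, `matched` holds names already in the database under either column, and `unmatched` holds names that appear only in the publication.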

mjy commented 2 years ago

This already exists. Gnfinder playground.

proceps commented 2 years ago

Well, for found names, maybe, but for names that are not found it is not that easy. All the lists mix matching and non-matching names. See the picture for an example. [image] Another example: Erythroneura vitifix is recognized as present, with the reference pointing to the taxon Erythroneura. In fact the name is a misspelling, but this misspelling is present in the database, so it should be recognized as one. [image] The label for a recognized name should include the author/year.

Basically, I need tuning of the interface and better filtering options. For example, the search finds a lot of names with an abbreviated genus. Those are useless for me; I need to know whether the name is present in the database or not, assuming that each species name is spelled out as a binomial in the publication. I should be able to recognize that there is a new combination or a new misspelling in the paper. I cannot do that with the present interface. I basically have to check names one by one, the same way I would in Browse nomenclature. The create-new-citation feature is not much help, and is actually dangerous: clicking the button automatically creates a combination (without a page number). For the page number, I have to search the publication manually. Once the citation is created, I cannot delete it (I can only do so by going to Browse nomenclature).
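As a rough illustration of dropping names with an abbreviated genus, the filter could look something like the sketch below. The pattern is an assumption about what an abbreviation looks like (one or two letters followed by a period, as in "E. vitifix"); `is_abbreviated` and `full_binomials` are hypothetical helpers, not part of gnfinder or TaxonWorks.

```python
import re

# Assumption: an abbreviated genus is a capital letter, optionally one
# lowercase letter, then a period and whitespace, e.g. "E. " or "Er. ".
ABBREVIATED_GENUS = re.compile(r"^[A-Z][a-z]?\.\s")


def is_abbreviated(name):
    """Return True if the name starts with an abbreviated genus."""
    return bool(ABBREVIATED_GENUS.match(name))


def full_binomials(names):
    """Keep only names whose genus is spelled out in full."""
    return [n for n in names if not is_abbreviated(n)]


names = ["E. vitifix", "Erythroneura vitifix", "Er. vitifix"]
kept = full_binomials(names)
```

Only the fully spelled name survives the filter, so it can then be compared against the database as a complete binomial.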

As an experiment, I used this source in the 3i project: https://sfg.taxonworks.org/tasks/sources/gnfinder?source_id=19123

Well, I am not sure whether it is possible to recognize clean names, but there is a match index. Maybe if I could set the filter threshold, it would be a little more informative.
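A user-adjustable threshold on the match index might work along these lines. This is a minimal sketch under stated assumptions: each result is a dict with a numeric `score` key, which is a hypothetical shape and not the actual gnfinder output format, and the values shown are made up.

```python
def filter_by_score(results, threshold):
    """Keep results whose match score is at or above the threshold.

    Assumption: each result is a dict with a numeric "score" key; the
    real match index in the interface may be named or scaled differently.
    """
    return [r for r in results if r["score"] >= threshold]


# Made-up scores for illustration only.
results = [
    {"name": "Aus bus", "score": 0.95},
    {"name": "Aus cus", "score": 0.40},
]
confident = filter_by_score(results, threshold=0.8)
```

Raising the threshold trades recall for cleaner lists; lowering it shows more candidate names, including likely misreads.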

proceps commented 2 years ago

Maybe an OCR text with hyperlinks on the names could be useful.

proceps commented 1 year ago

This is implemented in nomenclature match.