Open hohonuuli opened 2 years ago
Relates to solution for #128
Important to tackle before more data contribution; let's discuss more
(Pasting a bit of relevant discussion with @ermbutler from Slack)
To do a VARS-style auto-complete, we do NOT combine the scientific and common names. We let the user type a common name, but once they commit/save it, the app automatically changes it to the scientific name. To emulate that, maybe the best road is to require the user to type a few characters (maybe at least 3), then use the query/contains endpoint to get a list of potential matches. Once the user selects one, use the synonyms endpoint to get the accepted/scientific name. So for “starfish” the calls are:
The first result in the synonym list is the accepted name.
Anyway, I’m happy to chat with you about it, but I think combining the terms into Asteroidea (starfish) is confusing/cluttering for a user.
Also, I wouldn’t use the taxa/info endpoint to resolve scientific names. WoRMS isn’t 1:many. It’s actually (sort-of) many-to-many. As an example, if you use https://fathomnet.org/worms/taxa/info/Loligo%20opalescens it’s actually returning info about a former scientific name. But if you use https://fathomnet.org/worms/synonyms/Loligo%20opalescens, it will correctly resolve the accepted name to Doryteuthis opalescens.
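The lookup flow described in the discussion above (require a few characters, query for matches, then resolve the selection through synonyms) could be sketched roughly as follows. This is only an illustration under stated assumptions: the synonyms URL follows the example quoted above, but the exact path of the query/contains route and the JSON shape of both responses (a list of name strings) are guesses, and the function names `suggest` and `accepted_name` are hypothetical.

```python
import json
from typing import Callable, List, Optional
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the synonyms/taxa URLs quoted in this thread.
BASE_URL = "https://fathomnet.org/worms"
# "maybe at least 3" characters before querying, per the discussion.
MIN_CHARS = 3

Fetch = Callable[[str], list]


def http_get_json(url: str) -> list:
    """Default fetcher: GET the URL and decode the body as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def suggest(term: str, fetch: Fetch = http_get_json) -> List[str]:
    """Return candidate names for what the user has typed so far."""
    term = term.strip()
    if len(term) < MIN_CHARS:
        return []  # too short: skip the request entirely
    # ASSUMPTION: the query/contains route takes the term as a path segment,
    # mirroring the synonyms URL, and returns a JSON list of names.
    return list(fetch(f"{BASE_URL}/query/contains/{quote(term)}"))


def accepted_name(selected: str, fetch: Fetch = http_get_json) -> Optional[str]:
    """Resolve the user's selection to the accepted name via synonyms."""
    # ASSUMPTION: the response is a JSON list of names. Per the discussion,
    # the first entry in the synonym list is the accepted name.
    synonyms = fetch(f"{BASE_URL}/synonyms/{quote(selected)}")
    return synonyms[0] if synonyms else None
```

In a UI, `suggest` would run as the user types and `accepted_name` would run on commit/save, so the stored value is always the accepted name. Passing a stub for `fetch` lets the flow be exercised without network access.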
FathomNet concepts require data quality checks to ensure proper formatting for ML training. Below is a summary of data quality checks for images and annotations; a more detailed document can be found here. Ideally these checks are performed both during data upload and periodically after ingestion to maintain data integrity.
Top priority
à la VARS style. When folks enter a term, it needs to be constrained to valid input. This requires a naming service. Now that we have both MBARI's VARS KB and a fast WoRMS name server, we can constrain the terms using whatever provider the user selects. We still need a way to allow a user to enter unconstrained terms (this was Karen Osborn's suggestion).
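One way the constraint described above might be structured is sketched below: each naming provider is a lookup function, the user picks one, and an explicit opt-in allows unconstrained terms. All names here (`TermResult`, `resolve_term`, the `providers` mapping) are hypothetical, and how each provider actually performs its lookup is left out; this is not a description of the existing implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# A provider maps a user-entered term to its accepted name, or None if unknown.
NameLookup = Callable[[str], Optional[str]]


@dataclass
class TermResult:
    term: str                # value to store
    provider: Optional[str]  # provider that validated it, if any
    constrained: bool        # False when stored as free text


def resolve_term(
    raw: str,
    provider: str,
    providers: Dict[str, NameLookup],
    allow_unconstrained: bool = False,
) -> TermResult:
    """Constrain a term to the selected provider, with an explicit opt-out."""
    if provider not in providers:
        raise ValueError(f"unknown provider: {provider!r}")
    term = raw.strip()
    accepted = providers[provider](term)
    if accepted is not None:
        return TermResult(accepted, provider, True)
    if allow_unconstrained:
        # Keep the free-text term, but flag it so later checks can find it.
        return TermResult(term, None, False)
    raise ValueError(f"{term!r} is not a valid term for provider {provider!r}")
```

Flagging unconstrained terms rather than rejecting them keeps the escape hatch while still letting periodic data quality checks locate entries that were never validated.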