ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Scientific Name Checker returns false "is accepted name" status #2253

Closed sharpphyl closed 3 years ago

sharpphyl commented 5 years ago

Today I ran a list of names against Data Services: Scientific Name Checker.

The STATUS of a number of the taxon names was returned as "is accepted name" but when I opened the Arctos taxon record, the status is "invalid." Also, nothing was in the SUGGESTED_SCI_NAME column.

For example, Candidula gigaxii is listed as an accepted name.

[Screenshot, 2019-09-09 at 11:25:55 AM]

But opening that taxon name shows that the status is invalid.

[Screenshot, 2019-09-09 at 11:18:22 AM]

The valid name is Backeljaia gigaxii.

[Screenshot, 2019-09-09 at 11:18:44 AM]

I thought perhaps it was because the WoRMS (via Arctos) name was invalid, but the Arctos name is also invalid.

Is this a bug? It would seem that the report should have shown the valid synonym for the taxon name I searched. I can always run a Taxon Match in WoRMS and just not use this report, but this could be a problem for other users too.

Also, can the scientific name checker be used to find the status of WoRMS (via Arctos) terms or only the status in the Arctos source?

dustymc commented 5 years ago

You are confounding names and classifications.

The check returns "is in Arctos yes/no."

The 'valid..' stuff is classification metadata - that's "usage suggestions" not name validity (or in this case, mere existence in Arctos).
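A minimal sketch of the distinction drawn here, assuming a simplified in-memory model: the `TaxonName` and `Classification` classes, the `NAMES` table, and both helper functions are hypothetical illustrations, not the Arctos schema or code. The name check only asks whether a namestring exists; the status lives on the classifications attached to that name.

```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    source: str        # e.g. "Arctos" or "WoRMS (via Arctos)"
    taxon_status: str  # classification metadata, e.g. "valid" or "invalid"


@dataclass
class TaxonName:
    scientific_name: str
    classifications: list[Classification] = field(default_factory=list)


# Hypothetical stand-in for the name table; statuses follow the example in this issue.
NAMES = {
    "Candidula gigaxii": TaxonName(
        "Candidula gigaxii",
        [
            Classification("Arctos", "invalid"),
            Classification("WoRMS (via Arctos)", "invalid"),
        ],
    ),
    "Backeljaia gigaxii": TaxonName(
        "Backeljaia gigaxii",
        [Classification("Arctos", "valid")],
    ),
}


def name_exists(scientific_name: str) -> bool:
    """What the checker answers: is this namestring present at all?"""
    return scientific_name in NAMES


def statuses(scientific_name: str) -> dict[str, str]:
    """Classification metadata per source, which the checker does not consult."""
    name = NAMES.get(scientific_name)
    if name is None:
        return {}
    return {c.source: c.taxon_status for c in name.classifications}
```

In this model, `name_exists("Candidula gigaxii")` is true even though every classification for that name carries the status "invalid".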

DerekSikes commented 5 years ago

Please consider changing terms to those that will disambiguate this situation.

If 'valid' in this case means "is in Arctos" then please change to "is in Arctos"

-Derek

sharpphyl commented 5 years ago

Thanks for the suggestion, Derek, and for the clarification, Dusty.

Currently the definition of the field STATUS in the instructions reads:

STATUS - a possibly-useful indication of what might have happened and how we came up with whatever it is that we're suggesting.

That is totally open to individual interpretation when the report returns a STATUS of "is_accepted_name." When it's not in Arctos, the STATUS is "FAIL."

"Accepted" is used by WoRMS as a true taxon status and "accepted" appears in our definition of a "valid" taxon name in the CTTAXON_STATUS code table, so we need to purge that word from this report.

STATUS is used as a taxon status so I recommend that we also change the name of that field. Perhaps "IS IN A LOCAL SOURCE" with the options of YES and NO. Or AVAILABILITY, with options of "is in Arctos" and "is not in Arctos."

The instructions should clarify that even if the answer is YES, it only means that the taxon name is in one of the three local sources (Arctos, Arctos Plants and WoRMS (via Arctos)), but may or may not be a valid taxon name, and may not even have a higher classification. Is that an accurate interpretation of what the report shows?

Also the report name "SCIENTIFIC NAME CHECKER" sounds like it's checking the validity of scientific names in Arctos. Perhaps it should be AVAILABILITY OF TAXON NAMES.

FWIW, without the metadata, the report is of limited value and I'm better off either searching WoRMS directly or each taxon name one by one. Is it possible to enhance the report with the local source and the status from the CTTAXON_STATUS table? If no one else uses this report extensively, I'll work with the WoRMS Match Taxa tool since I know anything in WoRMS is now in Arctos.

dustymc commented 5 years ago

I'm fine with changing that to read whatever.

That is a NAME checker - it checks NAMES, not classifications. Like most everything in Arctos, its value depends on what you're trying to do. It's probably saved several hundreds of hours of work for me, so I think it's pretty cool....

Everything this tool finds should, by virtue of being in Arctos, be a valid taxon name.

sharpphyl commented 5 years ago

Everything this tool finds should, by virtue of being in Arctos, be a valid taxon name.

When you say this, do you mean that the taxon status of the name is "valid" or just that it's a real name even if it's now an invalid synonym? Again, I think that it's an "available" name that's in the Arctos taxonomic table, but it may or may not be valid. (Unfortunately, right now, there are still a lot of misspelled and erroneous names in Arctos, but you're right that they should all be "real" taxon names.)

Also, what's the difference between a "scientific name" and a "taxon name"? We have two reports, the one discussed here, "Scientific Name Checker," and the next one, "Taxon Name Validator." I understand the difference between the two reports, but what is the difference between the two types of names?

dustymc commented 5 years ago

taxon status

That's classifications and not within the purview of the tool.

it's a real name

Yes, or someone at some point thought it was real enough to include in Arctos anyway.

what's the difference between a "scientific name" and a "taxon name"?

I probably use those interchangeably. Like everywhere, context matters - there are two very different "core" fields called 'scientific_name' and maybe dozens of ancillary ones in Arctos.

Jegelewicz commented 5 years ago

Yes, or someone at some point thought it was real enough to include in Arctos anyway.

I think you overestimate our knowledge of taxonomic names....

There is a lot of garbage in the names list. I prefer the suggestion that the results say, "name exists in Arctos" or "name is not in Arctos". Not sure what other things are generated, but "valid" is misleading.

Also, what's the difference between a "scientific name" and a "taxon name"? We have two reports, the one discussed here, "Scientific Name Checker," and the next one, "Taxon Name Validator." I understand the difference between the two reports, but what is the difference between the two types of names?

There isn't a difference and we should pick one term and stick with it to be consistent. I suggest the following:

Scientific Name Checker - change to "Taxon Name in Arctos?"
Taxon Name Validator - can stay as it is

sharpphyl commented 5 years ago

Dusty, can we change the instructions so others don’t misunderstand the purpose of this report? Here are my suggestions. Note that I’m recommending that “is_accepted_name” be changed to “name_in_Arctos.”

Also, I have no idea what SUGGESTED_SCI_NAME is. I thought it was the valid synonym, so this explanation really needs to be clarified. How is the suggested name created? Is this the Arctos_relationship or the “possible alternate spelling” metadata or what?


This report is helpful to use prior to bulkloading your specimens. It tells you if a taxon name is in Arctos. If it is not, you will need to add it prior to bulkloading that specimen record.

Load a csv with one column titled “scientific_name.” The report will tell you which scientific names are in Arctos. Scientific names in Arctos will read “name in Arctos.” Scientific names that are not in Arctos will read “FAIL”.

See bulkloader taxonomy documentation and Taxonomy documentation for the full scoop on taxonomy.

This form considers only namestrings (that is, taxon_name.scientific_name) so will have a high false failure rate for data with complex names (Name sp., etc.)

Returned data will be

SCIENTIFIC_NAME: the name you loaded.
STATUS: this will show whether or not the taxon name is in Arctos. It says nothing about whether the taxon status is valid or invalid. It does not tell you if the name has a higher classification. It only tells you if the name is or is not in the Arctos taxonomic table.
SUGGESTED_SCI_NAME: this is what we think you meant. Replace your SCIENTIFIC_NAME with this and Arctos will probably be happy. You might not be though, so make sure you know what you're doing. (See above comment - this needs to be clarified)

No changes to the rest of the instructions.
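The proposed behavior can be sketched roughly as follows. This is an illustration only, not the Arctos implementation: `check_names` and the `names_in_arctos` set are hypothetical, and the status label uses the "name_in_Arctos" wording suggested above.

```python
import csv
import io


def check_names(csv_text, names_in_arctos):
    """Return one result row per uploaded name.

    csv_text: contents of a CSV with a single column titled "scientific_name".
    names_in_arctos: hypothetical set of namestrings standing in for the name table.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if "scientific_name" not in (reader.fieldnames or []):
        raise ValueError('expected a column titled "scientific_name"')

    results = []
    for row in reader:
        name = (row["scientific_name"] or "").strip()
        # Exact namestring match only: presence in the table, not taxon status.
        status = "name_in_Arctos" if name in names_in_arctos else "FAIL"
        results.append({"SCIENTIFIC_NAME": name, "STATUS": status})
    return results


# Illustrative data: one name from this issue and one "complex" name.
known = {"Candidula gigaxii", "Backeljaia gigaxii"}
upload = "scientific_name\nCandidula gigaxii\nCandidula sp.\n"
report = check_names(upload, known)
```

Here "Candidula gigaxii" comes back as "name_in_Arctos" regardless of its taxon status, while "Candidula sp." comes back as "FAIL" because only whole namestrings are compared, which is the false failure case mentioned in the instructions.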

dustymc commented 5 years ago

SUGGESTED_SCI_NAME

The script tries to figure out what you meant, and can sometimes correct minor misspellings, formatting errors, and so on.
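The thread does not say how the script produces its suggestion. One way such a correction could work, shown purely as an assumption, is whitespace normalization followed by fuzzy matching with Python's standard `difflib`; `suggest_name`, `known_names`, and the 0.8 cutoff are hypothetical.

```python
import difflib


def suggest_name(submitted, known_names, cutoff=0.8):
    """Return the closest known namestring, or None if nothing is close enough.

    This is a guess at the general technique, not the actual Arctos script.
    """
    # Collapse repeated whitespace and trim the ends (a simple formatting fix).
    normalized = " ".join(submitted.split())
    if normalized in known_names:
        return normalized
    # Fall back to the single best fuzzy match above the similarity cutoff.
    matches = difflib.get_close_matches(normalized, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


known_names = ["Backeljaia gigaxii", "Candidula gigaxii"]
suggestion = suggest_name("Candidula  gigaxi", known_names)
```

A suggestion produced this way is only a nearby spelling. It says nothing about whether the suggested name is the valid synonym.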

sharpphyl commented 5 years ago

Ok, your explanation above would be fine. Are you ok with making these changes to the instructions or does it need review by others? For me, I can close the issue, but I hate to have others make the same misinterpretations that I did.

dustymc commented 5 years ago

No, definitely leave this open until things are changed and documented, which probably won't happen in the immediate future.

Jegelewicz commented 5 years ago

/remind me to add to documentation tomorrow

reminders[bot] commented 5 years ago

@Jegelewicz set a reminder for Sep 12th 2019

reminders[bot] commented 5 years ago

:wave: @Jegelewicz, add to documentation

Jegelewicz commented 4 years ago

@dustymc I branched this page with edits (slightly modified) as suggested in https://github.com/ArctosDB/arctos/issues/2253#issuecomment-530535714

If it's OK, please merge them in. If not, let me know what needs fixing.

dustymc commented 4 years ago

I'm way more than "OK" with this! Yay/thanks! But...

1) It looks like you started with /master, which is precisely what you should have done, except we don't have the box that runs that. Prod is currently v7.12.
2) I can't figure out how to merge into a branch.
3) I don't have any way to test this anyway, see (1).

I just manually replaced mine with yours.

[Screenshot, 2019-10-14 at 11:07:30 AM]

...and fixed a span-->html

The best way to do this now is probably to get you a TACC account so you can pull to "test" on production - I think Chris would need to set that up. We can discuss commit hooks on test when we get that back.

Jegelewicz commented 4 years ago

Just tell me what I need to do from here - I am more than happy to edit non-operational code.

dustymc commented 4 years ago

I'll definitely take whatever help you're willing to offer.

With a test box:

Current situation

If you're willing to endure that second scenario, you should just be able to ask Chris how you can SSH to the production webserver, from where you can figure out the current version and pull from github. (You will need a tacc account, but I think access is separate or additional or something.) I'm happy to help with the details.

I absolutely understand if you're not willing to face that - I'm certainly not very happy about finding myself here! - and we can revisit this once we have a test server.

Jegelewicz commented 4 years ago

I just manually replaced mine with yours.

Well, some of it came out as expected, some not so much. I expected some spaces between some of the lines and stuff in the red box should have been deleted and stuff in the purple box shouldn't be there:

[image]

I'm willing to endure the second scenario because I expect to only be "breaking" descriptive stuff on any given page. I can see that the formatting I inserted worked in some cases, but not in others, so I have some learning to do before I have it down. Is this stuff in html? (what language should I be studying up on?)

The first thing I'd like to do is to make all of the tool pages have a similar format:

Title
Description
Required file format for upload (with link to a template)
"browse" and "upload" buttons

dustymc commented 4 years ago

Excellent. Let me know how I can help.

Yes it's HTML, ish. There's probably some CFML mixed in here and there, but it'll be easy to pick up in that context. I'll get a QnD "developers guide" started and we can add to it as we break things!

dustymc commented 4 years ago

http://handbook.arctosdb.org/how_to/developer-guide.html is hopefully the start of something we probably should have had a long time ago.

Jegelewicz commented 4 years ago

Wait - are you telling me that if I make the changes to v7.12 they should eventually show up in production?

dustymc commented 4 years ago

Yes, and if you get your TACC account wired into the web box you'll have full control of "eventually."

One issue (that I'm struggling with right now, without a lot of hope of an elegant solution) is that they'll also need to be patched into the PG repo, and you currently need a tunnel to TACC to access that.

sharpphyl commented 3 years ago

No, definitely leave this open until things are changed and documented, which probably won't happen in the immediate future. (Sept. 11, 2019)

@Jegelewicz and @dustymc This was closed but the instructions were never changed. I know what they mean now but it would be nice if new collections didn't have to try to figure it out. This issue is still in the Taxonomy Committee's project list under Not Reviewed. Teresa, do we need to discuss further or just put it back into our wish list?

Jegelewicz commented 3 years ago

Let's keep it open - it is assigned to me and I'll try to get to it...

Jegelewicz commented 3 years ago

@dustymc I have edited this page to conform to the above.

Suggest we change everything in this tool from scientific name to taxon name.

dustymc commented 3 years ago

Pulled - don't you have access now?? - needs an update.

I'll readily admit to flip-flopping back and forth between taxon_name and scientific_name (it's a long story!), but the form that this form exists to feed needs scientific_name - why break that?

Jegelewicz commented 3 years ago

but the form that this form exists to feed needs scientific_name - why break that?

Because mineral and cultural people will be using this tool...

Jegelewicz commented 3 years ago

Pulled - don't you have access now??

I was able to get in a few months back, but never able to successfully pull. Not sure I can find the directions again!

Jegelewicz commented 3 years ago

Fixed that.

dustymc commented 3 years ago

mineral and cultural people

Huh?

successfully pull.

Oh yeah - and it was doing something weird to permissions and I never got around to scrounging up some sysadmin help. Yet another thing https://github.com/ArctosDB/internal/issues/64 would greatly simplify.

Jegelewicz commented 3 years ago

This tool tells you if a name is in Arctos - this matters to all collections, not just biological ones!

dustymc commented 3 years ago

I still don't get it. The tool is built to tell you if what you're trying to load as names is going to explode. All collections use the same name structure. The 'trying to figure out what was intended' bits are going to be worthless to the cultural people unless we change something, but that's no reason to make them swap column headers and then swap them back again!

Jegelewicz commented 3 years ago

The tool ALSO tells you if the name exists in Arctos - which is how @sharpphyl and I have mainly used it. When a new cultural collection shows up with a new set of identifications, this tool will help them figure out if their identifications already exist in Arctos or if they need to add names.

Jegelewicz commented 3 years ago

Like this. Copy of sciname_lookup.zip

dustymc commented 3 years ago

Ah gotcha. So the possibilities are to standardize on one or the other (scientific_name is more "coreward" so my halfhearted vote is to retain it and eliminate taxon_name), or to change the headers going one way or the other for that intermediate tool.
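The second option, changing headers for the intermediate tool, might look something like this sketch. `rename_header` is a hypothetical helper, and the direction shown (taxon_name to scientific_name) is only one of the two possibilities discussed.

```python
import csv
import io


def rename_header(csv_text, old="taxon_name", new="scientific_name"):
    """Rename one column header in CSV text, leaving the data rows untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Only the first row (the header) is rewritten.
    rows[0] = [new if col == old else col for col in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


converted = rename_header("taxon_name\nCandidula gigaxii\n")
```

Standardizing on a single column name everywhere would make a conversion step like this unnecessary.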

Jegelewicz commented 3 years ago

I say we stick with taxon name as that is more generic and what we are using elsewhere:

[image]

[image]

except here:

[image]

I think that (scientific name) part on the main search page should go away. Some identifications will not be scientific names, at least I don't think art collections will view them that way.

sharpphyl commented 3 years ago

If you look at suggestions for updating the UI on the Taxonomy Search page https://docs.google.com/document/d/1Egj161LQDJaXsINA3IxNJoPmqcbnXc5i/edit you'll see this statement that was proposed to expand taxonomy to include more than just biological items.

"In Arctos, Taxonomy is any classification system of things or concepts. The items may be biological, mineral, cultural or other types as proposed by Arctos users."

I'd vote for taxon name. See https://en.wikipedia.org/wiki/Taxonomy_(general)

Jegelewicz commented 3 years ago

@dustymc can you pull this for me again - I made a bonehead mistake.

dustymc commented 3 years ago

done

Jegelewicz commented 3 years ago

Again when you get a chance, please.

dustymc commented 3 years ago

done

Jegelewicz commented 3 years ago

One more time, please!

dustymc commented 3 years ago

done

Jegelewicz commented 3 years ago

Just tried something nuts. Pull when you get a chance.

dustymc commented 3 years ago

pulled

Jegelewicz commented 3 years ago

dang it, another boneheaded error, pull when you can.

dustymc commented 3 years ago

pulled

Jegelewicz commented 3 years ago

THX. Once more whenever you can.

dustymc commented 3 years ago

pulled