ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

How is API argument scientific_name treated? #7980

Closed camwebb closed 1 month ago

camwebb commented 1 month ago

I'm writing some user documentation on how to search for names with different levels of specificity. The 'Identification' search box (which sends scientific_name to the API, AFAIK) seems not to be working as one might expect. Case in point: a record with two identifications, orders 1 and 2 (with ID 2, Claytonia arctica, being a temporary, dummy ID). Observations:

My confusion appears to be caused by not understanding how the API scientific_name is constructed.

How then can the user, in the main search page (not via the taxonomy pages):

Thanks. I'll write up my findings in the handbook.

camwebb commented 1 month ago

Partially answered this myself. Perhaps a dumb Q after all. The Identification box searches flat.scientific_name which is a generated field with a single name chosen from all the identifications, usually for identification_order = 1.
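
A minimal sketch of the behavior described above, assuming the cached name is simply the one with identification_order = 1 (the real rule lives in update_flat_row.sql and may differ). The `Identification` class, the `flat_scientific_name` helper and the placeholder name "Aus bus" are made up for illustration; only Claytonia arctica as ID 2 comes from the example record.

```python
from dataclasses import dataclass


@dataclass
class Identification:
    scientific_name: str
    identification_order: int


def flat_scientific_name(identifications):
    """Return the single cached name: here, the one with order 1 (assumed rule)."""
    for ident in identifications:
        if ident.identification_order == 1:
            return ident.scientific_name
    return None


# "Aus bus" is a placeholder for the record's accepted (order 1) name.
record = [
    Identification("Aus bus", 1),
    Identification("Claytonia arctica", 2),  # the temporary, dummy ID
]

flat_name = flat_scientific_name(record)

# A search against the cached field only ever sees the order 1 name,
# so the order 2 name cannot match it.
matches_dummy_id = flat_name == "Claytonia arctica"
```

Under that assumption, a scientific_name search for the order 2 name finds nothing on this record, whatever identification order is also selected.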

How a choice of identification order (not in flat) is combined with a search of flat.scientific_name, generating the behavior of observation 2 in the OP above, I have no idea. I think this is still an issue, because giving the user the ability to combine a name search with an identification order and for it not to produce the expected results seems like a problem.

If anyone is interested, the ALA internal documentation is here.

mkoo commented 1 month ago

@camwebb Thanks Cam for writing up these tech outlines for the user-- valuable! And yes you did answer yourself for the main issue as well as the bit of the second question. We could have a more robust easy query for combining flat.scientific_name and id_order (or the choice of "give me it all in any order the name shows up") because ID_order can only do one at a time.

Are you trying to map out the ALA query types within Arctos? I can see adding a few tweaks here and documentation to help accommodate users. If so, we can recast this as a functional need so it can be reviewed properly.

camwebb commented 1 month ago

Thanks @mkoo

We could have a more robust easy query for combining flat.scientific_name and id_order (or the choice of "give me it all in any order the name shows up") because ID_order can only do one at a time.

I think most users will reason that if there is an Identification box and an Identification Order box in the main search interface, then they should work together, e.g., find records where name=A and order>1. Currently this is not how the page works (AFAIK). The ID order only works if there is a taxon_name_id=X. I suggest hiding ID order from the user in 'Customize', but making it appear if queried via the URL, in exactly the same way that taxon_name_id cannot be selected, but will appear if in a URL.

Are you trying to map out the ALA query types within Arctos?

I guess so. I'm developing documentation for users of ALA specimens - the way 'UAM Plants' can be used is different from other collections.

dustymc commented 1 month ago

Short answer for now, happy to elaborate later.

https://github.com/ArctosDB/PG/blob/master/includes/specimenSearchQueryCode__param.cfm will tell you what the API is doing.

https://github.com/ArctosDB/PG_DDL/blob/master/function/flat_components/update_flat_row.sql will tell you how things get into FLAT, when that's involved.

Issues (https://github.com/ArctosDB/arctos/issues/7695 is probably most relevant here) will tell you WHY things are as they are in FLAT.

'UAM Plants' can be used is different

I doubt it, everyone else will totally want to steal your good ideas.

Bigger picture, I'm happy to help you understand the details of Arctos's current functionality, but I'd also very much like to understand what you wish it did. Maybe I can use that as some sort of default, maybe I can't accommodate it at all (not likely here), but everyone will be better off if I know a bit more about your wishlist/ideal/view. (Arctos is a big pile of users' good ideas, hopefully-sometimes correctly understood and implemented.)

name=A and order>1

I think there may be some confusion between identification (and what's been done for 7695) and the taxonomy behind those? One considers whatever's been cached into FLAT, the other considers the source of that cache - and perhaps that line isn't as clean as it could be in the code, UI, both, documentation, etc.

https://arctos.database.museum/search.cfm?guid_prefix=UAM%3AHerb&taxon_name=Claytonia%20arctica&identification_order=2 considers taxon name and identification order.
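
As a sketch of how such a search link can be assembled programmatically: the parameter names (guid_prefix, taxon_name, identification_order) are taken from the URL above, while the `search_url` helper itself is hypothetical and not part of Arctos.

```python
from urllib.parse import quote, urlencode

BASE = "https://arctos.database.museum/search.cfm"


def search_url(**params):
    """Build a search.cfm link; quote_via=quote encodes spaces as %20."""
    return BASE + "?" + urlencode(params, quote_via=quote)


url = search_url(
    guid_prefix="UAM:Herb",
    taxon_name="Claytonia arctica",
    identification_order=2,
)
print(url)
```

Keyword order is preserved, so the printed link has its parameters in the same order as the one above.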

camwebb commented 1 month ago

Great stuff @dustymc

github.com/ArctosDB/PG

I lost access to PG many months ago - didn't know that was where the active dev was. Please let me in.

github.com/ArctosDB/PG_DDL

Aha! That is the repo I've been looking for - the SQL sources. Please let me in here too!

WHY things are as they are in FLAT

I think these FLAT choices are good. Showing only name (and taxonomy) for IDorder = 1.

I'd also very much like to understand what you wish it did.

I wish it was possible in the main search interface to search for a name and an identification order (1, 2, >0, >1, etc.). I think this would require an additional lookup of submitted name to taxon_name_id. The UI structure I would suggest for this (if the user selects 'Identification Order' in 'Customize') is something that collects the search name and ID order together, and hides the 'Any Taxon' and 'Identification' boxes, which conflict. E.g.:

Default:

+------------+
|            | Any taxon, ID, common name (--> API:taxon_name_wide) 
+------------+
+------------+
|            | Identification (--> API:scientific_name = flat.scientific_name) 
+------------+

On customizing for Identification Order (hiding the above two boxes)

+------------+                                         +-------+
|            | Taxon name (--> API:taxon_name_exact)   |       | Identification Order
+------------+                                         +-------+

taxon_name_exact would be a direct lookup of names in 'taxon_name'. I've labelled the current API taxon_name as taxon_name_wide, because it looks through all the taxonomy and relationships.

...taxon_name=Claytonia%20arctica&identification_order=2 considers taxon name and identification order

is not what I am hoping for, because it will pull in related names, not just records matching the search name.

In this way, the user could search for records using:

  1. current single name (API:scientific_name = flat.scientific_name)
  2. (NEW) exact names in all or prior identifications (API:taxon_name_exact && identification_order)
  3. via related names (API:taxon_name_wide)
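
The three modes listed above can be sketched on toy data. Everything here is illustrative: taxon_name_exact and taxon_name_wide are only proposed names, the records and the "Aus bus" / "Aus cus" placeholder names are made up, and the relationship between Claytonia arctica and "Aus cus" is invented purely to show the difference between modes.

```python
# Toy records: guid -> list of (name, identification_order). All made up.
RECORDS = {
    "rec-1": [("Aus bus", 1), ("Claytonia arctica", 2)],
    "rec-2": [("Claytonia arctica", 1)],
    "rec-3": [("Aus cus", 1)],
}

# Invented relationship table standing in for taxonomy/relationship lookups.
RELATED = {"Claytonia arctica": {"Aus cus"}}


def order_matches(order, spec):
    """Interpret an order spec such as '1', '2', '>0' or '>1'."""
    if spec.startswith(">"):
        return order > int(spec[1:])
    return order == int(spec)


def search_flat(name):
    """Mode 1: only the single cached name (assumed here to be order 1)."""
    return sorted(g for g, ids in RECORDS.items()
                  if any(n == name and o == 1 for n, o in ids))


def search_exact(name, order_spec):
    """Mode 2 (proposed): exact name combined with an identification order."""
    return sorted(g for g, ids in RECORDS.items()
                  if any(n == name and order_matches(o, order_spec)
                         for n, o in ids))


def search_wide(name):
    """Mode 3: the name plus anything related to it, in any identification."""
    names = {name} | RELATED.get(name, set())
    return sorted(g for g, ids in RECORDS.items()
                  if any(n in names for n, _ in ids))
```

For example, `search_exact("Claytonia arctica", ">1")` picks out only the record where the name is a prior identification, which neither of the other two modes can do.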

dustymc commented 1 month ago

UI structure

New Issue, that will 100% get lost here. (Users can generally do whatever they want, but we can set defaults and change order and such.)

suggest for this (if the user selects 'Identification Order' in 'Customize')

That's probably also a separate issue, but perhaps not so straightforward. The default UI has to work for a HUGE diversity of data and users, I'm not sure that will be possible (but of course I'm always up for whatever). There is a pretty nice API, if you'd like a UI that does exactly whatever you'd like it to....

'Any Taxon' and 'Identification' boxes, which conflict.

There's no conflict, but if you want to avoid the relationships and globalnames data and such then you want to avoid any taxon.

https://arctos.database.museum/search.cfm?guid_prefix=UAM%3AHerb&scientific_name=Carex%20lachenalii&identification_order=2&tax_trm_1=%3DClaytonia%20arctica&tax_src_1=UAM%20Plants


is

https://arctos.database.museum/search.cfm?guid_prefix=UAM%3AHerb&identification_order=2&tax_trm_1=%3DClaytonia%20arctica&tax_src_1=UAM%20Plants has the same taxonomy requirements, but entirely ignores the identification.

You can drop the = prefix to eg include subspecies.

You can drop the source to get at 'name used by....' rather than 'concept/classification/whatever you want to call that thing.....', or change the source to get at eg 'the world according to NCBI' (or anyone else whose data have somehow made it to Arctos).
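
A small sketch of those variations, assuming only what the URLs above show: tax_trm_1 carries the term (with an optional = prefix for an exact match) and tax_src_1 carries the source. The `taxonomy_search_url` helper and its argument names are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "https://arctos.database.museum/search.cfm"


def taxonomy_search_url(term, exact=True, source=None,
                        identification_order=None, guid_prefix="UAM:Herb"):
    """Build a taxonomy-term search link from the parameters seen above."""
    params = {"guid_prefix": guid_prefix}
    if identification_order is not None:
        params["identification_order"] = identification_order
    # The '=' prefix asks for an exact match; dropping it widens the match
    # (e.g. to include subspecies).
    params["tax_trm_1"] = "=" + term if exact else term
    # Dropping the source searches names as used by any source.
    if source is not None:
        params["tax_src_1"] = source
    return BASE + "?" + urlencode(params, quote_via=quote)


exact_url = taxonomy_search_url("Claytonia arctica", source="UAM Plants",
                                identification_order=2)
loose_url = taxonomy_search_url("Claytonia arctica", exact=False)
```

Here `exact_url` reproduces the second link above, and `loose_url` drops both the = prefix and the source.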

I think that satisfies the request? If not let me know what I can do to help ("how do I find this thing by those parameters?" seems to be a useful approach), if so please let me know what we can do to make the documentation better.

camwebb commented 1 month ago

Thanks @dustymc. In our ALA documentation I showed users how to find these more restricted sets, using the taxonomy page to access taxon_name_id, and I plan to add this to the main documentation. So no need to fiddle with the UI. But this issue is here if there is UI discussion in the future.

Could you please let me see the PG and PG_DDL repos? Thanks

dustymc commented 1 month ago

I don't control repo access, @mkoo see https://github.com/ArctosDB/arctos/issues/7980#issuecomment-2292034990 (sorry if I'm not finding the procedure).