ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Feature Request - Bulk ID Update tool option - update all existing identifications by +1 #7983

Open genevieve-anderegg opened 2 months ago

genevieve-anderegg commented 2 months ago

Issue Documentation is http://handbook.arctosdb.org/how_to/How-to-Use-Issues-in-Arctos.html

Is your feature request related to a problem? Please describe. Here in DMNS:Inv we make extensive use of the ID order features in Arctos. We usually have ID order=1 for the current updated and accepted identification, ID order=2 for the legacy identifications with outdated names, and sometimes ID order=0 for old identifications that have been determined as incorrect. We check taxonomy updates occasionally as names get revised. We have specimens where the taxonomic name has been updated, so we need to add a new identification as order=1, but change the older identification (previously order=1) to order=2, and if there is a legacy ID, change it to order=3 (previously order=2). This structuring of identifications helps us track how a specimen's identification has been changed as its taxonomy has been updated. When we need to update the identifications on multiple records, we can add the new IDs, but then need to go into each individual record to change the previously order=1 IDs to order=2, previously order=2 IDs to order=3, and so on. Could we add an option on the bulk ID update tool to do this? Effectively adding +1 to all previous IDs to move them "down" in the ID order (except, importantly, for ID order=0 identifications, because those have been determined as incorrect altogether).
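For illustration, the requested cascade can be sketched in a few lines of Python. This is only a minimal sketch of the logic described in this request, not Arctos code: the function name `cascade_orders`, the placeholder names, and the representation of a record's identifications as a list of (name, order) pairs are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical shape: (identification name, ID order)
Identification = Tuple[str, int]


def cascade_orders(existing: List[Identification], new_name: str) -> List[Identification]:
    """Add new_name as order=1 and push every existing ID down by one.

    Identifications with order=0 (determined to be incorrect) are left untouched.
    """
    shifted = [
        (name, order if order == 0 else order + 1)
        for name, order in existing
    ]
    return [(new_name, 1)] + shifted


# Placeholder names, not real taxa.
record = [("Old name A", 1), ("Legacy name B", 2), ("Wrong name C", 0)]
updated = cascade_orders(record, "Revised name D")
print(updated)
```

The order=0 check is the important part: a plain "+1 to everything" would quietly promote rejected identifications to order=1.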

Describe what you're trying to accomplish Easily update multiple IDs in a "cascading" manner of acceptedness.

Describe the solution you'd like Add a third option to this drop down that is "Update all existing IDs by adding +1" (or some other text explaining it that makes more sense). image

Describe alternatives you've considered Go through and change all the preexisting identification IDs in records one by one. Sometimes this is for many many records. Boo.

Additional context Related previous issue: https://github.com/ArctosDB/arctos/issues/7009

Priority Please assign a priority-label. Unprioritized issues get sent into a black hole of despair.

dustymc commented 2 months ago

The identification bulkloader can do what's requested.

(And those real-time bulk update apps are not sustainable and should all be replaced with 'bulkloader writers'....)

campmlc commented 2 months ago

The real-time bulk update tools are super useful for quickly making changes to records pulled up into a search. Using bulkloaders to do the same thing, as they currently stand, would involve much more time and effort to download data, reformat, re-upload, fix errors, redo, etc, and would likely mean these actions would not get completed. @dustymc Are you suggesting a bulkloader writer that would autofill the necessary fields from the Search results Tools menu into a bulkloader ready csv that automatically goes to the appropriate component loader? That would be great.
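As a rough illustration of what such a "bulkloader writer" might do, the sketch below turns a list of records from a search into loader-ready CSV text. It is an assumption-laden example, not the actual Arctos implementation: the function name `write_loader_csv`, the column names, the blank `status` column standing in for "not yet marked to load", and the example record identifiers are all hypothetical, and the real identification bulkloader template may look different.

```python
import csv
import io
from typing import Iterable

# Hypothetical column names; the real identification bulkloader template may differ.
COLUMNS = ["guid", "scientific_name", "identification_order", "status"]


def write_loader_csv(guids: Iterable[str], scientific_name: str, order: int = 1) -> str:
    """Build loader-ready CSV text with one row per record in the search results."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for guid in guids:
        writer.writerow({
            "guid": guid,
            "scientific_name": scientific_name,
            "identification_order": order,
            # Left blank so a person can review the rows before marking them to load.
            "status": "",
        })
    return buffer.getvalue()


# Made-up record identifiers and a placeholder name.
csv_text = write_loader_csv(["EXAMPLE:Coll:1", "EXAMPLE:Coll:2"], "Revised name D")
print(csv_text)
```

The point of the sketch is that the user never touches a spreadsheet: the rows are filled in from the search results and only need review.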

dustymc commented 2 months ago

super useful

And super expensive in CPU, and somewhat expensive in maintenance/complexity/etc. by being largely redundant with more powerful tools.

autofill the necessary fields from the Search results

Identifiers has been doing this for quite a while, it's a MUCH more scalable process and I don't think it touches any of your listed concerns.

genevieve-anderegg commented 2 months ago

(And those real-time bulk update apps are not sustainable and should all be replaced with 'bulkloader writers'....)

Does that mean the kind of functionality provided by tools like this is going away? If we lose this kind of tool and make everything a process that requires bulkloading, Arctos will become intensely cumbersome and laborious. I do not bulkload that often at all, and for making a few small changes to only a handful of records, bulkloading is NOT the way to go. If I have to figure out a spreadsheet to upload a few small changes and then unload the old data, it will take me more than twice as long. I hope this isn't the case!!!

dustymc commented 2 months ago

spreadsheet

@genevieve-anderegg that is not at all what I'm suggesting, I've started https://github.com/ArctosDB/arctos/discussions/8014 to discuss how (and/or if) this might be implemented.

sharpphyl commented 2 months ago

that is not at all what I'm suggesting

Phew. Thanks for clarifying. Can you address the original request? Is it possible to add a third option to "do nothing" and "set all existing IDs to order=0"?

The third option would be "set all order=1 to order=2 and all order=2 to order=3" or some similar structure that moves an existing ID to the next higher (by count) order so this doesn't have to be done manually.

genevieve-anderegg commented 2 weeks ago

Phew. Thanks for clarifying. Can you address the original request? Is it possible to add a third option to "do nothing" and "set all existing IDs to order=0"? The third option would be "set all order=1 to order=2 and all order=2 to order=3" or some similar structure that moves an existing ID to the next higher (by count) order so this doesn't have to be done manually.

Checking in on this. I don't think it matters if it is a component loader or a real-time loader (as long as the tool makes clear with a message how long these requests might take), but a tool like this would be very useful. In DMNS:Inv we update taxonomy (and therefore identifications) frequently, as WoRMS via Arctos is our resource. We add new identifications frequently, and use the identification order capabilities along with that. This makes for many identifications. The requested tool would be so useful and timesaving.

Reiterating and agreeing with Mariel above:

The identification bulkloader can do what's requested.

The real-time bulk update tools are super useful for quickly making changes to records pulled up into a search. Using bulkloaders to do the same thing, as they currently stand, would involve much more time and effort to download data, reformat, re-upload, fix errors, redo, etc, and would likely mean these actions would not get completed.

Do we need to talk about this and https://github.com/orgs/ArctosDB/discussions/8014 in the WG/Issues meetings to make this happen?

dustymc commented 2 weeks ago

talk about this and https://github.com/orgs/ArctosDB/discussions/8014

Please (but I'm less-sure of the most appropriate venue).

Adding another thing that kinda does what an existing thing does seems somewhere between "a bit sub-optimal" and "unsustainable." (And I keep hearing that Arctos is too complex, which might be a hard push towards the latter.)

Adding to the barely-functional (and occasionally fatal) thing doesn't seem great (as above - significant support would obviously change that picture, nothing here is any kind of One True Solution, I'm just trying to suggest things that seem realistic and within reach and such).

Replacing the existing thing with a much more powerful thing which does the same things (and more, and much more scalably, albeit less-instantly) seems like an improvement.

The existing thing has some weird not-identification part-container-something stuff as well, I think @DerekSikes might be the only user, not sure what to do with that. (A "write to these two loaders" widget might handle it and seems technically plausible, but not sure I understand anything enough to even propose, much less write, such things.)

I don't think there are any (new) technical limitations here, but there's a significant question of sustainability (both big-picture and specifics involving computational, development/maintenance, and usability); direction from The Community would be most useful.

campmlc commented 2 weeks ago

I support adding to existing tools for the immediate need, then seeking resources and experimenting with a component loader system in the longer term.

dustymc commented 2 weeks ago

adding to existing tools for the immediate need,

In case I somehow haven't been clear on this: This to me seems like the worst possible solution, unsustainable in at least the three ways I mentioned in my previous comment.

then seeking resources

In support of anything in particular?

experimenting with a component loader system

I don't know what this experiment would involve, I'm just proposing a straighter path to existing tools.

campmlc commented 2 weeks ago

Adding to the barely-functional (and occasionally fatal) thing doesn't seem great (as above - significant support would obviously change that picture, nothing here is any kind of One True Solution, I'm just trying to suggest things that seem realistic and within reach and such).

This may be barely functional from a database perspective but it is the most functional and desirable from a user functionality and experience perspective. If significant support would make this possible, we have consensus above that it would be the preferred option.

As an alternative, we need to find a way to make this: https://github.com/orgs/ArctosDB/discussions/8014 as simple, clear, and easy to use and understand as the above. It would have to involve autofill and ability to mark to load without having to leave the search results and find multiple scattered tools in other menus. We also must deal with the problem of a single user pipeline in the component loaders that causes unpredictable and unreasonable delays in data upload. One user should not be able to block workflows for all others, as is currently the case. I hesitate to move more things to component loaders until that is resolved.

Can't we just fulfill this current request for this unstable tool in the meantime while we look at more sustainable and user appropriate options? We are losing data and preventing work. @mkoo

DerekSikes commented 2 weeks ago

"The existing thing has some weird not-identification part-container-something stuff as well, I think @DerekSikes might be the only user, not sure what to do with that. "

I use this a lot! Whenever I update identifications of specimens I am almost always moving them at the same time, so having a single tool that takes the same catalog records and (1) allows ID updates and (2) container updates is golden. Note also, that moving specimens (container stuff) is part of identification work, since we physically arrange specimens according to their identifications (and yes we use barcodes too, but being able to find all the specimens of taxon X in physical proximity is critical to efficient collections management, identification work, and taxonomic research).

dustymc commented 2 weeks ago

Thanks @DerekSikes! That seems easy enough - filter parts however you want, can be the same UI, send those data to Container: Bulk Move. (And it would be easy enough to get rid of "This form will NOT install parts." and send those to Part: Bulkload to Containers if that's useful.)

DerekSikes commented 2 weeks ago

Um... sounds like you are suggesting the form to be changed? Or are you talking about "behind the scenes stuff" that I needn't care about? Also, despite having used Arctos since 2012 I still have no idea what 'install parts' means. I've been using the object tracking system all that time so it seems like this lack of understanding hasn't handicapped me.

dustymc commented 2 weeks ago

suggesting the form to be changed?

No, or at least not necessarily or significantly. (This request would require the addition of an "and for existing IDs..." option, and possibly something would be best relabeled to fit in with the component loader environment or something.) If something's wonky this would be a great time for an update, but I don't think any (major) visible changes would be necessary.

behind the scenes stuff

Exactly. (See https://github.com/orgs/ArctosDB/discussions/8014 for some not-quite-tech stuff, but basically Oracle and PostgreSQL do things differently and the code needs to acknowledge that.)

no idea what 'install parts' means

"Install" is to install the part (the part-container really, all parts are also containers and those are what can interact with other containers, but feel free to not know that) into a container capable of carrying a barcode - a NUNC tube or pin or such.

"Move" container puts a container capable of carrying a barcode into another barcode-capable container - the tube into a freezer box, the pin into a unit tray, etc.

Basically "install" has to deal with nearly-unidentifiable parts, "move" deals with easily-identifiable part-holders.
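The distinction can be pictured with a toy model in Python. This is a sketch under stated assumptions, not the Arctos data model: the `Container` class, its `barcode_capable` flag, and the `install` and `move` functions are hypothetical names invented for the example, and treating a part-container as simply "not barcode-capable" is a simplification of the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    label: str
    barcode_capable: bool  # assumption: part-containers cannot carry a barcode
    parent: Optional["Container"] = None


def install(part: Container, holder: Container) -> None:
    """Put a part-container into a container that can carry a barcode."""
    if part.barcode_capable or not holder.barcode_capable:
        raise ValueError("install expects a part going into a barcode-capable container")
    part.parent = holder


def move(holder: Container, destination: Container) -> None:
    """Put one barcode-capable container inside another barcode-capable container."""
    if not (holder.barcode_capable and destination.barcode_capable):
        raise ValueError("move expects two barcode-capable containers")
    holder.parent = destination


part = Container("tissue part", barcode_capable=False)
tube = Container("NUNC tube", barcode_capable=True)
box = Container("freezer box", barcode_capable=True)

install(part, tube)  # the part goes into the tube
move(tube, box)      # the tube goes into the freezer box
print(part.parent.label, "->", tube.parent.label)
```

In this toy model the only difference between the two operations is what is being placed: a part that cannot be scanned, or a holder that can.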

campmlc commented 2 weeks ago

So would this change populate files in the component loaders that would still need to be marked to load?

genevieve-anderegg commented 2 weeks ago

So would this change populate files in the component loaders that would still need to be marked to load?

If this is the case for a workable solution, I support it. It would still result in fewer clicks, and we would not have to create a csv and bulkload identifications, which is the main thing we are trying to avoid, as that is the most time-consuming step.