ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

unsafe operator #5156

Closed dustymc closed 1 year ago

dustymc commented 1 year ago

https://arctos.database.museum/agents.cfm?agent_id=21346272

@wellerjes I don't want to interrupt anything but group accounts are not safe nor allowed and I'm going to nuke it - but I can wait a bit if you're in the middle of something.

https://handbook.arctosdb.org/documentation/users.html

dustymc commented 1 year ago

And https://arctos.database.museum/agents.cfm?agent_id=21346270

campmlc commented 1 year ago

Just checked, and this is for an iDigBio digitization event, and the permissions are just data entry and coldfusion user. I don't see how an event like this could be run without using something like this.

campmlc commented 1 year ago

I support allowing these types of agent/operators with very limited permissions- otherwise we would have to exclude Arctos collections from being able to participate in these kinds of events.

dustymc commented 1 year ago

This is primarily a security issue. Support in the form of addressing that is most welcome.

Second - and maybe it shouldn't be - Arctos isn't made to support that, and it's probably going to be a crappy user experience.

I don't know what's going on, but I'm relatively sure that given a little warning I could come up with something that doesn't constitute a security risk or involve making more agent messes.

wellerjes commented 1 year ago

@dustymc Thanks for bringing this to my attention. We have a WeDigBio event this Friday/Saturday where participants will be transcribing labels directly into Arctos. There are approximately 10 participants each day; rather than creating an account for each participant and having to manage operator/user permissions for each one, we wanted a general user account that they could use the day of. They will only be using the data entry form (I have a profile/template set up for use) under staff supervision. I've entered the participants into Arctos separately, as volunteers of the Chicago Academy of Sciences.

dustymc commented 1 year ago

I can wait until next week to do anything.

Data entry may be very twitchy in that scenario - changes to the layout are likely to bounce around between computers sharing the account. There's no way I can do anything in the next 2 days, but that might be a good use case for some kind of specialized form (maybe even in something like google sheets). ANYWAY - it's worth testing what you're planning before you end up with a room full of frustrated users (who will blame Arctos, which isn't great).

wellerjes commented 1 year ago

Thank you! We just did a test run with three people and everything went smoothly.

For more background on the event: we'll have the workstations signed in prior to the event, so we won't be giving out passwords / user information. There will be 4 staff members overseeing 10 to 11 participants each day, with 6 or 7 actively working on data entry. We have a Google Form backup created in case we encounter any issues.

It would be great to have a feature for transcription events in the future, especially for transcription-centric events like WeDigBio. (As a side note: We are also preparing to use Zooniverse in future events with materials that are already digitized, so we would not be relying solely on Arctos for all transcription events.) We have done a transcription event like this before with Google forms, but it takes a lot of time to clean the data to get it ready for Arctos afterwards, and we'd really like to include this as part of the participant experience.

We can follow up and let you know how the experience went after our event.

mkoo commented 1 year ago

I am all for transcription events! yay! However, I just want to figure out/ensure we have a good setup here both for Arctos, which is made for trained users rather than crowdsourced events, and for the volunteers. Even on Wikipedia and other public data entry platforms, users still register to get 'credit' and be "incentivized", so maybe we could have a rundown on your event (what works, what could be better) at an AWG meeting @jessica and thus be better prepared for the next one. As an aside, another transcription platform to explore is FromThePage, which we tried to get funded for some Arctos integration.

dustymc commented 1 year ago

In addition to the security and potential UI problems,

  1. This is a big mess of noncompliant Agents that must be cleaned up (and that's going to require my scary password and likely some downtime)
  2. The people ultimately aren't going to get proper attribution for their work
  3. This almost certainly violates TACC's access policies

So what "works" here isn't necessarily going to be sustainable.

(I suspect the solution is in how we create users, but I'm pretty open to about anything.)

Jegelewicz commented 1 year ago

My suggestion is that a Google form that puts data in the bulkloader format would be a much better solution. It would allow for cleanup before things get loaded and, I think, a better user experience. With just a little prep, the higher geography, agents, taxonomy, and other code table vocabs can be loaded into the form (or use the Arctos API to just get at it directly). FWIW, this is what I did with a group of students in a WeDigBio event back in 2017 without the link to Arctos controlled vocabularies, and it still worked out pretty well, saving me all of the initial transcription time.

I do think that "verbatim" things get lost when people enter data directly into Arctos and I feel pretty certain that no matter how well it is explained, first time data entry peeps will not understand the locality stack.

Multiple people entering data under a single username will not let you easily understand which of your volunteers made any messes. Why not have each volunteer create a profile? It only takes five minutes if you do it in person, first thing.
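As a rough illustration of the "form that puts data in the bulkloader format" idea, here is a minimal Python sketch that renames the columns of a form's CSV export to bulkloader-style column names. The question titles and the column names on the right (`cat_num`, `taxon_name`, `entered_by`, and so on) are assumptions made up for this example, not the actual Arctos bulkloader headers; swap in whatever your bulkloader template uses. Keeping a per-row `entered_by` value is one way to see which volunteer entered which row.

```python
import csv
import io

# Hypothetical mapping: form question title -> bulkloader-style column name.
# None of these names are taken from Arctos; replace them with your template's headers.
FORM_TO_BULK = {
    "Catalog number": "cat_num",
    "Scientific name": "taxon_name",
    "Verbatim locality": "verbatim_locality",
    "Verbatim date": "verbatim_date",
    "Your name": "entered_by",
}


def form_rows_to_bulk(form_csv_text):
    """Read a form's CSV export and return rows keyed by bulkloader-style names."""
    reader = csv.DictReader(io.StringIO(form_csv_text))
    rows = []
    for row in reader:
        # Missing answers become empty strings; stray whitespace is trimmed.
        rows.append(
            {bulk: (row.get(form) or "").strip() for form, bulk in FORM_TO_BULK.items()}
        )
    return rows


def write_bulk_csv(rows):
    """Serialize the renamed rows as CSV text ready for review before loading."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(FORM_TO_BULK.values()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

The output is plain CSV text, so it can still be reviewed and cleaned in a spreadsheet before anything is loaded.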

campmlc commented 1 year ago

We've been needing to make changes to our user/operator model for a while, for example, to deal with operators who switch collections or need to have different levels of permissions for different collections. We already have to make public and private accounts for the same user - one as operator, one as public user. And I know that people have repeatedly tried to create alternate operator accounts for the same user by using a different name or email. This means our current model isn't working for our needs as users. Perhaps we need different "tiers" of operator access? Rather than assigning permissions individually, each tier comes with a certain level of access? Then we could hard restrict the access to allow data entry only for the lowest or second lowest tier? And hard link that to the public agent?
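To make the "tiers" idea a bit more concrete, here is a minimal Python sketch of what such a model could look like: one account per person, linked to one agent, with a tier assigned per collection. This is not how Arctos permissions are implemented; the tier names, action names, and the `Operator` class are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical access tiers, lowest to highest."""
    PUBLIC = 0
    EVENT_DATA_ENTRY = 1
    STAFF = 2


# Each tier comes with a fixed bundle of actions instead of individual grants.
TIER_ACTIONS = {
    Tier.PUBLIC: frozenset({"search"}),
    Tier.EVENT_DATA_ENTRY: frozenset({"search", "data_entry"}),
    Tier.STAFF: frozenset({"search", "data_entry", "edit_records"}),
}


@dataclass
class Operator:
    """One account per person, hard-linked to a single agent."""
    username: str
    agent_id: int
    tiers: dict = field(default_factory=dict)  # collection name -> Tier

    def grant(self, collection, tier):
        self.tiers[collection] = tier

    def revoke(self, collection):
        # Removing access leaves the account and its agent link intact.
        self.tiers.pop(collection, None)

    def can(self, collection, action):
        # Collections with no explicit grant fall back to public access.
        tier = self.tiers.get(collection, Tier.PUBLIC)
        return action in TIER_ACTIONS[tier]
```

In this sketch an operator who switches collections keeps the same account: one grant is revoked and another is added, and the link to the agent never changes.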

campmlc commented 1 year ago

a Google form that puts data in the bulkloader format

Why not have each volunteer create a profile? It only takes five minutes if you do it in person, first thing.

This is what we did for a Boy Scout troop digitization project last spring at MSB. I had each pair of digitizers enter data into a separate tab in a Google Sheet, with standardized headers. They also took pictures of the labels, and I created a Google Photos album for them to share all the photos.

But we still do need to deal with the Agent problem.

wellerjes commented 1 year ago

Thanks for the feedback! We are using the Google Form for the event. We definitely do not want to violate TACC's access policies.

Our main goal with using Arctos for this event is to give people a better idea of the work that goes on behind the scenes. We were planning to review every catalog record in the bulkloader before it was uploaded. Internally, our hope was to eliminate the extra step of cleaning data from the Google form and have more of the information standardized as it went into Arctos.

For the other concerns:

AGENTS: I have entered each participant as an agent and created an association with the Chicago Academy of Sciences as a volunteer. I have not entered anyone without the documentation required to create new agents. If we move forward with using Arctos for public transcription events in the future, we can include that step (you need to create an account in Arctos) in the sign-up process. I do like Mariel's idea of having different tiers of public user access. We have created accounts for long-term users (interns and volunteers), but not for single-day event volunteers. Has anyone done this before and how did you handle it?

CREDIT: Individuals will get credit for their work through the processing history attribute as the "Determiner". We have had transcription events in the past using Google Forms, and have made sure each volunteer received credit for their transcription work by recording their name in the Arctos record in some way (typically Remarks, but going forward, processing history/label transcribed).

Jegelewicz commented 1 year ago

To get all that you want, the best thing is to have each participant create an operator account that is associated with their agent. Grant them data entry for the appropriate collection, then when the event is over just remove their access to the collection. The great thing about this is when they come back next year, you just give them collection access and they are off! All of their work will be on their agent page and everyone wins!

campmlc commented 1 year ago

But this would be really difficult for mass online digitization events. I do like the idea of being able to let people use the data entry interface and all the controlled vocabulary, but it can be difficult when they hit a taxon name or agent or geography that is not in Arctos yet. That can be super frustrating to novice users. Any way to add controlled vocabulary to a google sheet or Excel?

Jegelewicz commented 1 year ago

Any way to add controlled vocabulary to a google sheet or Excel?

Yes - there definitely is, but that doesn't accomplish the "credit" - maybe we need an attribute for this?

dustymc commented 1 year ago

Google Form

A few of us have long believed that data entry forms should be treated a bit like labels, where everyone has a few of them - one for bats, one for seals, one super generic one that does anything but isn't much fun, etc., etc. The technology (including Arctos' API) might actually be ready to support that now, someone should take a hard look at what's possible.

taxon name

This has been a problem before, it shouldn't be now - grab your local checklist, load it to your form, load it to Arctos as a Source. Worst case, A {string} exists: use the genus-or-whatever and sort it out later.

agent

Verbatim agents covers this (but I'm not sure it's an actual issue for most transcription use cases??)

geography

That's all being cleaned now, it's (becoming) easy to predict what's there, or (with coordinates) 'we refuse to say' is as functional as anything else.

Any way to add controlled vocabulary to a google sheet

Yes, the question is if you can plug it into Arctos' API, and then if you can cache that for field work (but you might want to use something like Excel for that).

mkoo commented 1 year ago

@campmlc yes, you can export tables from Arctos to provide controlled lists in either spreadsheet program-- just requires a little set up and fresh exports

@Jegelewicz attribution to an individual -- why not as an agent? or at the least verbatim agent. Seems like we have this in place.

I think an external set up could be created with some of the scripting options that google docs provide so this could be mainly managed outside of Arctos, data prepped externally, stats compiled for events then imported to Arctos. Maybe this kind of event could be captured and tracked in a Project too
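A minimal sketch of that external prep step, in Python: check transcribed rows against a controlled list exported from Arctos, set aside anything that doesn't match for staff review, and compile per-volunteer counts for the event. The field names (`taxon_name`, `entered_by`) and the idea that the export is a simple list of accepted values are assumptions for illustration only.

```python
from collections import Counter


def split_by_vocab(rows, field, vocab):
    """Split rows into (accepted, flagged) by whether row[field] is in vocab.

    `vocab` is assumed to be a list of accepted values taken from an exported
    table. Matching is exact after trimming whitespace.
    """
    allowed = {value.strip() for value in vocab}
    accepted, flagged = [], []
    for row in rows:
        value = (row.get(field) or "").strip()
        (accepted if value in allowed else flagged).append(row)
    return accepted, flagged


def rows_per_volunteer(rows, name_field="entered_by"):
    """Count transcribed rows per volunteer for event stats and attribution."""
    return Counter((row.get(name_field) or "unknown").strip() for row in rows)
```

Flagged rows are not discarded; they are kept separately so staff can sort out new names before import.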

Jegelewicz commented 1 year ago

This is where we could save time as a community. As it is, everyone is on their own to create these kinds of events. If we worked together, we could come up with a usable system that any Arctos collection could pick up and run with. WeDigBio happens like twice a year and offering an easy path for Arctos collections to participate would be a selling point. @wellerjes @droberts49 it's probably late for you guys at this point, but can we set up maybe a half day workshop so that you can present what you did and we can come up with something that any collection manager could pick up and run with next time?