ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Cleaning up of low information agents #4903

Closed lin-fred closed 1 year ago

lin-fred commented 2 years ago

@dustymc can you post a list of the agents that have only remarks, which you'd like us to start looking through?

AJLinn commented 2 years ago

But also - everyone may want to review the list of recently marked for merge agents to make sure they aren't losing something.

Thanks for the notice - I'm curious about these markings also... I had one particular agent marked who was associated with four transactions (recipient of loans). I added a couple of URLs and an "alive" relationship, but I am concerned about why this one got wrapped into the mass-merge.

Also, I am sorting by remark length and # of collections and will add to the existing Google sheet so that everyone can work on this collaboratively. Please work in the "New list" tab.

Are we being given time to do some editing of these agents before they get nuked, because I apparently have 918 on the "new list"???? I'm not quite sure what planned action this list represents...

Jegelewicz commented 2 years ago

@AJLinn I think @dustymc decided to go wild last week and merge a bunch of stuff, but I really want to wait on most merges until AFTER any string-only agents are converted to verbatim agents so that whatever people are using NOW is what they get THEN.

Also, just in case you have missed it: any agent that has only names and remarks and is ONLY involved in collector roles will be converted to a verbatim agent around the first of next year. Any agents in this group that you want to keep need to have a status, relationship, or address added to keep them agent-worthy. Also note that you can no longer add an agent without one of these things.
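The rule described above can be sketched as a small check. This is only an illustration under stated assumptions: the `Agent` class, its field names, and the `COLLECTOR_ROLES` set are hypothetical and are not Arctos's actual schema or code. Treating an agent with no recorded roles as outside the rule is also an assumption.

```python
from dataclasses import dataclass, field

# Assumption: the set of roles that count as "collector roles" (hypothetical).
COLLECTOR_ROLES = {"collector"}


@dataclass
class Agent:
    """Hypothetical agent record; field names are illustrative only."""
    names: list
    remarks: list = field(default_factory=list)
    statuses: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    addresses: list = field(default_factory=list)
    roles: set = field(default_factory=set)


def is_verbatim_candidate(agent: Agent) -> bool:
    """Return True if the agent would be converted under the stated rule.

    The agent qualifies when it has no status, relationship, or address
    (only names and remarks) and every role it holds is a collector role.
    """
    has_identifying_data = bool(
        agent.statuses or agent.relationships or agent.addresses
    )
    only_collector_roles = bool(agent.roles) and agent.roles <= COLLECTOR_ROLES
    return not has_identifying_data and only_collector_roles


# A names-and-remarks-only collector is a candidate for conversion.
sparse = Agent(names=["J. Doe"], remarks=["collected birds"], roles={"collector"})
# Adding an address (or a status or relationship) keeps the agent.
kept = Agent(names=["J. Doe"], addresses=["Fairbanks, Alaska"], roles={"collector"})

print(is_verbatim_candidate(sparse))
print(is_verbatim_candidate(kept))
```

Under this sketch, adding any one of a status, relationship, or address is enough to take an agent off the conversion list.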

This is a plea to PLEASE allow the collections to do this work! Mass-merging stuff now is going to mean people losing verbatim information that they have currently recorded in agents.

AJLinn commented 2 years ago

Thanks @Jegelewicz - I've had to miss out on a bunch of Arctos stuff because of an 8-week-long seminar I've been involved with (now completed) and have not been tracking agent stuff as closely as I should have. I am fully invested in doing this work but agree, we MUST have time to prioritize the work and it's not going to be a quick fix! Forcing us to drop all of the other time-critical priorities to do these fixes before things get merged is not going to engender a positive working environment! It's going to lead to mistakes and pissed-off users, NOT the intended improvement of agent records.

For example, for just UAM:EH, dealing with 918 names on this Google spreadsheet, assuming it takes an average of 5 minutes per name (some will take much more effort while others will be faster), means 76.5 hours of work!!! I assume others have just as many entries to review, and none of us have two weeks of dedicated time to devote only to this task.

This is a plea to PLEASE allow the collections to do this work! Mass-merging stuff now is going to mean people losing verbatim information that they have currently recorded in agents.

1000-times YES!

Jegelewicz commented 2 years ago

just UAM:EH dealing with 918 names on this google spreadsheet assuming it takes an average of 5 minutes per name (some will take much more effort while others will be faster) means 76.5 hours of work!!!

Yep - I can get through about 20-25 in an hour. FWIW I have been spending some time each week just transferring information from remarks to the various status, relationship and address fields, so I am trying to help everyone get some of this done before the deadline. Also, @ArctosDB/agents-committee is meeting half an hour early each month to work on this too.
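As a rough check on the numbers in this exchange, the sketch below converts a review rate into total hours for the 918 names. The function name `review_hours` is made up for illustration; the only inputs taken from the thread are the 918 names, the 5-minutes-per-name estimate, and the 20-25 per hour rate.

```python
def review_hours(agent_count: int, agents_per_hour: float) -> float:
    """Hours needed to review agent_count agents at a steady hourly rate."""
    if agents_per_hour <= 0:
        raise ValueError("agents_per_hour must be positive")
    return agent_count / agents_per_hour


NAMES_ON_LIST = 918  # figure quoted for UAM:EH in the thread

# 5 minutes per name is 12 names per hour, which gives the 76.5 hours quoted.
print(review_hours(NAMES_ON_LIST, 60 / 5))

# At the faster pace of 20 to 25 names per hour the total drops accordingly.
for rate in (20, 25):
    print(rate, round(review_hours(NAMES_ON_LIST, rate), 1))
```

Even at the faster rate this is still roughly a full working week for a single collection's list.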

dustymc commented 2 years ago

allow the collections

That's what's happening, https://github.com/ArctosDB/arctos/issues/4930

krgomez commented 2 years ago

I'll work on adding more data to the UAM:Art agent profiles in the list. It can be challenging to research some of our more obscure artists, but I will do what I can. If an agent is a determiner of an attribute, does this disqualify them from being changed into a verbatim agent? Can you clarify, is the list shared in this issue all of the "low quality agents", or are there more?

Jegelewicz commented 2 years ago

If an agent is a determiner of an attribute, does this disqualify them from being changed into a verbatim agent?

Right now, yes.

is the list shared in this issue all of the "low quality agents", or are there more?

All as of the day it was made - but more agents get added every day...

marecaguthrie commented 2 years ago

Hi all - I had a full-on panic attack about this and thought we were going to have to leave Arctos and find a different database for art collections. As we’ve talked about before, it is very normal to have very little information about an artist (sometimes just an initial), but it still has enormous value and is a lot more data than nothing. The idea of losing that was so distressing - the creator field is probably our most legally/ethically important field to track in the collection. I was distraught that we would need to add a birth date, because living artists (living humans) have the right to privacy about their birth date and are not required to provide it to us.

Karinna has calmed me down by explaining that we can add “associated with X collection” to keep them from getting merged. Can I get confirmation that our creator agents with low data will not be erased/lost if they are formally associated with our collection? And that these are all the agents? There won’t be more in the future that will be erased if I’m not keeping tabs on Arctos discussions? This feels utterly terrifying to me.

I haven’t read this paper and its recommendations in detail, but it seems relevant to this conversation, and I’m wondering if the working group has a policy/statement about ethics/privacy for people who have personal info in Arctos? Or maybe something to develop if we don’t? https://mdsoar.org/handle/11603/14397

Jegelewicz commented 2 years ago

@marecaguthrie don't panic! We aren't asking anyone to add any information that isn't already in a biography - just to add it in some more appropriate places! I will make it a personal mission to look through your agents to ensure that you know if any of them are headed for the verbatim agent attribute.

BUT even that would not lose anything! If an agent is "verbatimized" to a verbatim agent attribute any remark associated with the agent (your biography) will also go into the verbatim agent remark and they could be "upgraded" to an agent at any time there is something other than name or remark to identify them.

if the working group has a policy/statement about ethics/privacy for people who have personal info in Arctos? Or maybe something to develop if we don’t?

We really don't and we should but we do encumber certain agent information (all addresses except ORCiD, Wikidata and Library of Congress as those are already public). I will start an issue in the internal repo for this.

dustymc commented 2 years ago

There is still some very fundamental misunderstanding, or miscommunication, or misSOMETHING at play here.

Nothing can be lost; the defining characteristic of a verbatim agent is that the information fits in that structure without loss.

"Goes by single initial, prefers anonymity" is a great fit for verbatim agents; what we're doing does what you say you need to do much better than what we're coming from (where "A." would assuredly get credited with a bunch of unrelated low-information activity, and then probably changed to fit those misattributed data) possibly can.

Jegelewicz commented 2 years ago

@marecaguthrie here is an example - Litho-Krome Company

Before I did anything, this agent only had the following information

[image]

Had I left it alone, instead of this on the catalog record:

[image]

you would have seen:

[image]

verbatim agent

| Agent | method | by | date | remark |
|---|---|---|---|---|
| Litho-Krome Company | creator | Karinna Gomez | 2019-02-18 | Lithographic printing company in Columbus, Georgia. |

For a real example: [image]

BUT I just took the "address" Columbus, Georgia from the remarks and BOOM, now this is worthy of remaining an agent. With a few clicks, I was able to find their LinkedIn and a Bloomberg page, which both included a FULL address - and it appears this company is closed according to Google (plus their website is up for grabs). [image]

So, nothing that isn't already public was needed in order to "agentify" this strings-only agent, and now nothing in your records will change, nor will their public agent page, except for the addition of the URLs, which are already public. But also, the agent is more complete and others can tell if it is the same Litho-Krome Company they have in their data or if there is a new incarnation of this company name.

Hope that helps!

AJLinn commented 2 years ago

Question: If the low-quality agent gets moved to the Verbatim Agent attribute, will they show up if someone searches for the agent from the search page?

[Screenshot: 2022-09-14 at 4:48:33 PM]

dustymc commented 2 years ago

low-quality

Low information - I don't think that's the same as or even a decent proxy for 'quality.'

[Screenshot: 2022-09-14 at 6:19:47 PM]

ewommack commented 2 years ago

So to summarize (because I know this can be a bit confusing, and we've been working on this for a long time):

@dustymc @Jegelewicz @lin-fred @droberts49 do I have the summary right?

dustymc commented 2 years ago

My only objection is around the categorization of "down-graded." "Verbatimizing" is a lateral move, functionally equivalent to any other approach. Bigger-picture it should result in a much more information rich environment where things like duplicates (which prevent giving proper credit) are much less likely to exist, so while the path may not be direct I think the end result is inevitably an up-grading.

ewommack commented 2 years ago

@dustymc I changed the wording. What do you think?

dustymc commented 2 years ago

Nice, one more request - consider changing

You can search verbatim agents

to

You can search verbatim agents; no functionality is lost

ewommack commented 2 years ago

I think we should be good to go for the summary. Here is a clean version of it so we can link to this comment when we are discussing the issue. I'll try to keep track of developments and add to the summary as things come up:

So to summarize (because I know this can be a bit confusing, and we've been working on this for a long time):

krgomez commented 2 years ago

I finished going through and adding more data for the UAM:Art agents on the list attached in this issue.

mkoo commented 2 years ago

Assigning an archival database student to help. Just so we are all working on the same file to clean up agents, this is what I am sharing with Jihyun.

marecaguthrie commented 2 years ago

Thanks for explaining! That is reassuring!

dustymc commented 1 year ago

I think we're done here.