ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum
Apache License 2.0

Do we need agent guardrails? (was: funky agents) #7649

Open dustymc opened 3 months ago

dustymc commented 3 months ago

Do we need rules or guidance around agents?

I've noticed some not-great agent data being created. I don't know whether @ArctosDB/agents-committee would care to attempt to establish any guardrails, or whether this is fine, or ?? Please advise, or close if nobody cares.

Possible Actions

Examples and ponderings and such follow


Here are agents with a nonunique preferred name:


select
    agent_id,
    agent_type,
    preferred_agent_name,
    getpreferredagentname(created_by_agent_id) creator,
    created_date
from 
    agent
where
    preferred_agent_name in (select preferred_agent_name from agent group by preferred_agent_name having count(*) > 1)
order by preferred_agent_name,created_date desc;

 agent_id |  agent_type  | preferred_agent_name |         creator          |        created_date        
----------+--------------+----------------------+--------------------------+----------------------------
 21352027 | person       | Allison Nelson       | Jonathan L. Dunnum       | 2024-03-25 08:55:51.725221
 21350942 | person       | Allison Nelson       | Katherine L. Anderson    | 2024-01-31 14:53:50.530417
 21301738 | person       | Ben D. Marks         | Charles M. Dardia        | 2016-06-28 13:32:51
 21248037 | person       | Ben D. Marks         | unknown                  | 2013-12-16 21:49:31
 21351392 | person       | Bruce B. Paige       | Derek S. Sikes           | 2024-03-05 13:38:17.074186
 21348083 | person       | Bruce B. Paige       | Teresa J. Mayfield-Meyer | 2023-04-17 16:58:23.684949
 21351197 | person       | David Johnson        | Derek S. Sikes           | 2024-02-26 19:05:23.736418
 21295057 | person       | David Johnson        | Dusty L. McDonald        | 2015-10-06 11:30:13
 21352097 | organization | DOI Foundation       | C. O. Webb               | 2024-04-03 20:28:58.545203
 21348956 | organization | DOI Foundation       | Dusty L. McDonald        | 2023-06-29 08:16:07.045192
 21351450 | person       | G. S. Tulloch        | Derek S. Sikes           | 2024-03-05 13:38:17.997551
 21351074 | person       | G. S. Tulloch        | Jayce Williamson         | 2024-02-12 13:45:24.589931
 21351805 | person       | Jared Hughey         | Derek S. Sikes           | 2024-03-05 13:38:29.414319
 21348137 | person       | Jared Hughey         | Justin Fulkerson         | 2023-04-21 18:07:15.093075
 21351445 | person       | J. Jacobs            | Derek S. Sikes           | 2024-03-05 13:38:17.907213
 21350919 | person       | J. Jacobs            | Jessica Weller           | 2024-01-27 11:25:38.328081
 21351651 | person       | Laura Lofgren        | Derek S. Sikes           | 2024-03-05 13:38:26.615875
 21349012 | person       | Laura Lofgren        | Jayce Williamson         | 2023-07-16 14:23:47.257662
 21333621 | person       | Lauren Wilson        | Zack Perry               | 2021-07-13 12:39:08.99178
 21300714 | person       | Lauren Wilson        | Erica Krimmel            | 2016-04-12 12:17:17
 21351943 | person       | Mary Ann Sundown     | Angela Linn              | 2024-03-09 20:57:52.329768
 21347114 | person       | Mary Ann Sundown     | Shealyn Golden           | 2023-01-27 02:25:06.394785
 21351324 | person       | R. Leiner            | Derek S. Sikes           | 2024-03-05 13:37:47.64515
 21349050 | person       | R. Leiner            | Jayce Williamson         | 2023-07-26 15:59:10.979065
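
For discussion, a minimal sketch of what a creation-time guardrail against names like these could look like. Everything here is an assumption for illustration: the function names `normalize_name` and `possible_duplicates` are hypothetical, the toy list stands in for a lookup against the agent table, and nothing suggests Arctos currently does this.

```python
import re
import unicodedata


def normalize_name(name):
    """Fold case, strip accents, drop punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", no_accents.casefold())
    return " ".join(no_punct.split())


def possible_duplicates(candidate, existing):
    """Return (agent_id, name) pairs whose normalized name matches the candidate."""
    key = normalize_name(candidate)
    return [(agent_id, name) for agent_id, name in existing
            if normalize_name(name) == key]


# Toy data copied from the result above; a real check would query the agent table.
existing_agents = [
    (21350942, "Allison Nelson"),
    (21248037, "Ben D. Marks"),
    (21348956, "DOI Foundation"),
]

print(possible_duplicates("ben d marks", existing_agents))
```

A match would only be a prompt for the person creating the agent to look at the existing record first. As the rest of this thread shows, identical names are not proof of identical agents.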

and recent person-agents - many of which are clearly not persons - without a first or last name:

select
    agent.agent_id,
    preferred_agent_name,
    getpreferredagentname(agent.created_by_agent_id) creator,
    agent.created_date
from 
    agent
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in ('first name','last name')
where
    agent.agent_type='person' and
    agent_attribute.attribute_id is null 
    and agent.created_date > current_date - interval '1 year' -- remove this line for all; it's too much to paste here
order by agent.created_date desc
;
 agent_id |                  preferred_agent_name                  |     creator     |        created_date        
----------+--------------------------------------------------------+-----------------+----------------------------
 21352122 | Jack Spratt                                            | Jozef A. Slowik | 2024-04-05 12:12:44.141425
 21352119 | C. Stillman                                            | Jozef A. Slowik | 2024-04-04 16:05:10.937667
 21352117 | B. S. Blitz                                            | Jozef A. Slowik | 2024-04-04 15:34:50.688057
 21352116 | F. Sorensen                                            | Jozef A. Slowik | 2024-04-04 15:25:19.630444
 21352109 | M. Rosy                                                | Jozef A. Slowik | 2024-04-04 12:59:13.243463
 21352057 | unrecorded                                             | Angela Linn     | 2024-03-26 16:19:24.093752
 21351875 | Bundtzen                                               | Derek S. Sikes  | 2024-03-05 13:38:39.295728
 21351874 | Sid                                                    | Derek S. Sikes  | 2024-03-05 13:38:39.285538
 21351873 | Kenai Veterinary Clinic                                | Derek S. Sikes  | 2024-03-05 13:38:39.275253
 21351872 | Schmidt                                                | Derek S. Sikes  | 2024-03-05 13:38:39.262718
 21351870 | Chester                                                | Derek S. Sikes  | 2024-03-05 13:38:39.226274
 21351818 | Snarski                                                | Derek S. Sikes  | 2024-03-05 13:38:29.660891
 21351809 | Lucas                                                  | Derek S. Sikes  | 2024-03-05 13:38:29.487845
 21351798 | Buck                                                   | Derek S. Sikes  | 2024-03-05 13:38:29.300602
 21351784 | Galena Butterfly Festival Participants                 | Derek S. Sikes  | 2024-03-05 13:38:29.048438
 21351782 | Dick H. Bishop                                         | Derek S. Sikes  | 2024-03-05 13:38:29.023773
 21351746 | Femaida                                                | Derek S. Sikes  | 2024-03-05 13:38:28.354653
 21351744 | Challet                                                | Derek S. Sikes  | 2024-03-05 13:38:28.329927
 21351719 | Southside Animal Hospital                              | Derek S. Sikes  | 2024-03-05 13:38:27.869955
 21351701 | Gwichin Renewable Resources                            | Derek S. Sikes  | 2024-03-05 13:38:27.544315
 21351685 | v. Doesburg                                            | Derek S. Sikes  | 2024-03-05 13:38:27.270269
 21351611 | et al.                                                 | Derek S. Sikes  | 2024-03-05 13:38:25.915341
 21351598 | Sjodin                                                 | Derek S. Sikes  | 2024-03-05 13:38:25.657055
 21351592 | L. Shults                                              | Derek S. Sikes  | 2024-03-05 13:38:20.591105
 21351570 | Schuh & Gray                                           | Derek S. Sikes  | 2024-03-05 13:38:20.233703
 21351546 | R. Latta                                               | Derek S. Sikes  | 2024-03-05 13:38:19.846565
 21351543 | S. Craig                                               | Derek S. Sikes  | 2024-03-05 13:38:19.80164
 21351542 | Bio 116 students                                       | Derek S. Sikes  | 2024-03-05 13:38:19.790886
 21351511 | Waterways Vet Clinic                                   | Derek S. Sikes  | 2024-03-05 13:38:19.29779
 21351485 | Fran & Pete                                            | Derek S. Sikes  | 2024-03-05 13:38:18.724329
 21351479 | Smithhisher                                            | Derek S. Sikes  | 2024-03-05 13:38:18.612474
 21351478 | Unalakleet School students                             | Derek S. Sikes  | 2024-03-05 13:38:18.601517
 21351472 | Christian F. Weisser                                   | Derek S. Sikes  | 2024-03-05 13:38:18.499276
 21351451 | D. A. P.                                               | Derek S. Sikes  | 2024-03-05 13:38:18.021055
 21351438 | Stoneman                                               | Derek S. Sikes  | 2024-03-05 13:38:17.796991
 21351435 | Chriska Derr                                           | Derek S. Sikes  | 2024-03-05 13:38:17.748558
 21351434 | IAS                                                    | Derek S. Sikes  | 2024-03-05 13:38:17.738767
 21351433 | Lehman                                                 | Derek S. Sikes  | 2024-03-05 13:38:17.721695
 21351430 | Pam's pet grooming                                     | Derek S. Sikes  | 2024-03-05 13:38:17.680503
 21351418 | Calkins                                                | Derek S. Sikes  | 2024-03-05 13:38:17.505252
 21351396 | College Village Animal Clinic                          | Derek S. Sikes  | 2024-03-05 13:38:17.138685
 21351393 | Southeast Alaska Animal Medical Center                 | Derek S. Sikes  | 2024-03-05 13:38:17.094364
 21351381 | Roberts                                                | Derek S. Sikes  | 2024-03-05 13:38:16.894943
 21351347 | Stream Ecology Class UAF                               | Derek S. Sikes  | 2024-03-05 13:38:16.3071
 21351345 | Plant Protection Division Ministry of Agriculture USSR | Derek S. Sikes  | 2024-03-05 13:38:16.274119
 21351302 | Taiga                                                  | Derek S. Sikes  | 2024-03-05 13:37:47.281312
 21351297 | DHS                                                    | Derek S. Sikes  | 2024-03-05 13:37:47.19272
 21351294 | Wasilla Veterinary Clinic                              | Derek S. Sikes  | 2024-03-05 13:37:47.138723
 21351282 | McCarthy                                               | Derek S. Sikes  | 2024-03-05 13:34:13.480942
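
One way guidance on person-agents could be made checkable is a simple name heuristic run before (or after) creation. The sketch below is only an illustration: the word lists are hypothetical, picked from the names in the result above rather than from any Arctos rule, and `flag_person_name` is an invented name.

```python
import re

# Hypothetical marker words, chosen from the result above; not an Arctos rule.
ORG_WORDS = {"clinic", "hospital", "center", "class", "students", "participants",
             "ministry", "division", "school", "festival", "resources", "grooming"}
PLACEHOLDERS = {"unrecorded", "unknown", "et al."}


def flag_person_name(name):
    """Return a list of reasons a 'person' preferred name looks suspicious."""
    reasons = []
    lowered = name.casefold().strip()
    words = set(re.findall(r"[a-z']+", lowered))
    if lowered in PLACEHOLDERS:
        reasons.append("placeholder, not a name")
    if words & ORG_WORDS:
        reasons.append("looks like an organization or group")
    if "&" in name or " and " in lowered:
        reasons.append("looks like more than one person")
    # Only report a bare single token if nothing more specific was found.
    if not reasons and len(name.split()) == 1:
        reasons.append("single token; first or last name missing")
    return reasons


for example in ["Kenai Veterinary Clinic", "Fran & Pete", "et al.", "Snarski", "Dick H. Bishop"]:
    print(example, "->", flag_person_name(example))
```

An empty list does not mean the record is fine (Dick H. Bishop is in the result above only because the first and last name attributes are missing), so this would complement the query, not replace it.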

Jegelewicz commented 3 months ago

A bunch of these are from @DerekSikes. Is there a reason the existing agents aren't being found?

dustymc commented 3 months ago

If you mean the nonunique names, that's probably me recovering verbatims that had been re-created post-verbatimization (and blaming Derek for it!). Likely one of them should be marked a duplicate of the other, but there's not enough information for me (nor anyone else, I suspect....) to actually determine that.

Jegelewicz commented 3 months ago

Going forward, I am going to favor Snarski over unknown.... :-)

DerekSikes commented 3 months ago

While I'm here, can someone explain the difference between 'create person' and 'create agent'? Apparently using Arctos for over 10 years hasn't gotten me to an advanced enough knowledge level to know this :)

DerekSikes commented 3 months ago

If one asserts that two agents are the same (one is a bad duplicate) I assume all or any metadata for the bad duplicate is lost? Is this true?

I can see where two agent records are duplicates but both have different metadata and a 'merging' of metadata would be helpful.

DerekSikes commented 3 months ago

Pretty sure 'et al.' is a mistake. It is often used on labels without enough space to write all the agents involved, so it indicates "and others", but it would only be used in conjunction with an actual agent, e.g. https://arctos.database.museum/guid/UAM:Ento:275837 has collector Amy M. Runck et al.

But instead of 2 collectors with one being 'et al.', should there be only Amy M. Runck et al. as a single string?

Or maybe we should keep the et al. agent so we don't need to have a separate agent for every real person who might end up in an et al. situation?

dustymc commented 3 months ago

> difference

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctagent_type

(And maybe that's all silly - table person is decades-gone, the rules which replaced it are now gone, the philosophy which required the rules may be gone - but I'm clearly having a hard time getting over that, hence this issue! - maybe type just isn't important and we're pointlessly making things difficult for ourselves, IDK.)

> assume

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctagent_attribute_type#bad_duplicate_of

Nothing's lost, nothing's automated, users just get a notification suggesting an update.

> 'merging' of metadata would be helpful

Absolutely - and that's a job for a human.

> mistake

You can make it a bad duplicate of something, then use....

[Screenshot 2024-04-09 at 04 53 23]

... to update any records.

> separate agent for every real person who might end up in an et al. situation?

Definitely not something I'd be happy to see, but I'm not hearing a lot of guidance from @ArctosDB/agents-committee at the moment....

To expand slightly on that and wander even farther from the topic at hand, I'm not happy to see "sorry, you're not very important to us" in ANY form, even when it's not me personally mushed into the et al. or faceless agency (see the host's attribute determiner) or whatever. Maybe I'm just being twitchy - I don't have field notebooks, we were recording data directly on AF sheets, the "traditional" importance of collector is perhaps less-relevant here - but I still think proper attribution is in general about the second-most important thing that museums can do (the first being solid links between literature and material).

DerekSikes commented 3 months ago

Thanks. And for the record... our best practices protocol is to record all collectors in Arctos, but then for our tiny insect labels we often abbreviate to et al. The problem arises when someone makes labels BEFORE the data are captured and all we have are the tiny labels to enter the data from (retroactive data capture) - which is 99%+ of what most entomology collections are dealing with. UAM Entomology is a rare exception in that most of our data capture is prospective (before labeling).

dustymc commented 2 months ago

moved to https://github.com/ArctosDB/arctos/issues/7550#issuecomment-2048418496

dustymc commented 2 months ago

More fun, there are a lot of agents being created with various could-be-important data stuffed into remarks.

We need a webinar, or some sort of education campaign??


select
    agent.agent_id,
    preferred_agent_name,
    getpreferredagentname(agent.created_by_agent_id) creator,
    agent.created_date,
    rem.attribute_value
from 
    agent
    inner join agent_attribute rem on agent.agent_id=rem.agent_id and rem.attribute_type in ('remarks')
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in (
        select attribute_type from ctagent_attribute_type where purpose in ('address','event')
    )
where
    agent.agent_type='person' and
    agent_attribute.attribute_id is null and
     agent.created_date > current_date - interval '3 months'
order by agent.created_date desc
;

 agent_id |    preferred_agent_name    |        creator         |        created_date        |                                                  attribute_value                                                  
----------+----------------------------+------------------------+----------------------------+-------------------------------------------------------------------------------------------------------------------
 21352220 | L. Mironova                | Elena Taboko Taku      | 2024-04-12 11:50:41.997135 | Collected plants in Russia in 1986.
 21352218 | A. Nikulin                 | Elena Taboko Taku      | 2024-04-12 11:44:36.500603 | Collected plants in Russia in 1987.
 21352217 | V. Zemtsov                 | Elena Taboko Taku      | 2024-04-12 11:37:40.970177 | Collected plants in Russia in 1987.
 21352216 | V. F. Ezrailson            | Elena Taboko Taku      | 2024-04-12 11:32:48.073711 | Collected plants in Russia in 1966
 21352215 | I. Belskaya                | Elena Taboko Taku      | 2024-04-12 11:25:12.524505 | Collected plants in Russia in 1979.
 21352214 | T. Ishkova                 | Elena Taboko Taku      | 2024-04-12 11:19:29.117051 | Collected plants in Russia in 1983.
 21352213 | O. Babarykina              | Elena Taboko Taku      | 2024-04-12 11:13:00.603817 | Collected plants in Russia in 1984.
 21352212 | G. Liventseva              | Elena Taboko Taku      | 2024-04-12 11:09:19.76335  | Collected plants in Russia in 1984.
 21352211 | I. Pshenichnaya            | Elena Taboko Taku      | 2024-04-12 11:02:45.327919 | Collected plants in Russia in 1985.
 21352210 | G. D. Dymina               | Elena Taboko Taku      | 2024-04-12 10:54:17.99777  | Collected plants in Russia in 1979.
 21352209 | O. Zhdanova                | Elena Taboko Taku      | 2024-04-12 10:52:07.066693 | Collected plants in Russia in 1986.
 21352204 | O. Feronova                | Elena Taboko Taku      | 2024-04-11 18:26:41.110518 | Collected plants in Russia in 1979
 21352202 | V. Rozhitsina              | Elena Taboko Taku      | 2024-04-11 17:59:02.128203 | Collected plants in Russia in 1978
 21352177 | O. Babarykina              | Elena Taboko Taku      | 2024-04-11 00:23:58.065582 | Collected plants in Russia in 1984
 21352176 | I. Pshenichnaya            | Elena Taboko Taku      | 2024-04-11 00:19:57.708599 | Collected plants in Russia in 1984
 21352175 | L. Dyukova                 | Elena Taboko Taku      | 2024-04-11 00:15:46.461338 | Collected plants in Russia in 1985
 21352174 | L. Mironova                | Elena Taboko Taku      | 2024-04-11 00:09:32.275383 | Collected plants in Russia in 1985
 21352168 | A. Vershinin               | Elena Taboko Taku      | 2024-04-10 14:19:07.197052 | Collected plants in Russia in 1981
 21352166 | A. Andreeva                | Elena Taboko Taku      | 2024-04-10 14:10:44.58855  | Collected plants in Russia in 1980
 21352164 | G. Yakovleva               | Elena Taboko Taku      | 2024-04-10 13:55:12.067619 | Collected plants in Russia in 1976
 21352163 | V. Shein                   | Elena Taboko Taku      | 2024-04-10 13:52:27.865396 | Collected plants in Russia in 1980
 21352162 | S. Borisenko               | Elena Taboko Taku      | 2024-04-10 13:50:53.240816 | Collected plants in Russia in 1980
 21352160 | T. Fedorova                | Elena Taboko Taku      | 2024-04-10 13:49:23.075608 | Collected plants in Russia in 1978
 21352158 | V. Golovanova              | Elena Taboko Taku      | 2024-04-10 13:44:53.973037 | Collected plants in Russia in 1980
 21352157 | N. Sidorenko               | Elena Taboko Taku      | 2024-04-10 13:36:57.756534 | Collected plants in Russia in 1980
 21352154 | A. Maneev                  | Elena Taboko Taku      | 2024-04-10 11:09:13.600432 | Collected plants in Russia in 1982
 21352152 | B. O'Donnell               | Searra Schell          | 2024-04-09 18:25:02.1307   | Collected in Katmai in 1993
 21352135 | N. Tumanova                | C. O. Webb             | 2024-04-05 17:41:31.472675 | Collector of Far Eastern Russia plants in 1971
 21352134 | V. Schvydkaya              | C. O. Webb             | 2024-04-05 17:39:51.876208 | Collector of Far Eastern Russia plants in 1985
 21352023 | G. Schelkovnikova          | C. O. Webb             | 2024-03-22 19:50:41.124828 | Collected plants with S.S Kharkevich in 1974
 21351967 | P. Hafker                  | Alison Whiting         | 2024-03-14 12:55:03.462628 | NEON - Utah project
 21351965 | Ian Pearse                 | C. O. Webb             | 2024-03-14 12:19:59.433642 | Volunteer for AKNHP in 2002
 21351962 | J. Batchelor               | Alison Whiting         | 2024-03-14 10:14:43.730232 | NEON - Utah project
 21351921 | Michael R. Howard          | Jessica K. Tir         | 2024-03-07 10:47:18.654563 | Collected plants and herps for the Walla Walla College collection (plants now housed at WSU Owenby Herbarium)
 21351920 | Lauren Baur                | Mariel L. Campbell     | 2024-03-06 19:46:45.630318 | Sevilleta LTER Research Scientist and Program Manager 2017-2023
 21351920 | Lauren Baur                | Mariel L. Campbell     | 2024-03-06 19:46:45.630318 | Sevilleta LTER Research Scientist and Program Manager
 21351913 | Harold Stowell             | Brooke Bogan           | 2024-03-06 12:27:29.169484 | Professor of Geological Sciences, The University of Alabama
 21351909 | Joann Stoddard             | Alison Whiting         | 2024-03-06 11:08:39.336368 | rehabilitates wildlife
 21351885 | B. Fraser                  | Derek S. Sikes         | 2024-03-05 13:38:39.472276 | Migrated from ALA database.
 21351881 | Nico Limon                 | Derek S. Sikes         | 2024-03-05 13:38:39.405224 | Healy, AK resident 2018
 21351875 | Bundtzen                   | Derek S. Sikes         | 2024-03-05 13:38:39.295728 | Invertebrate bulkload agent.
 21351872 | Schmidt                    | Derek S. Sikes         | 2024-03-05 13:38:39.262718 | C.E. Scmidt? G.A. Scmidt? Karl P. Scmidt? R. Schmidt?
 21351869 | P. Valkenberg              | Derek S. Sikes         | 2024-03-05 13:38:39.202204 | ALS volunteer Lepidoptera collector in AK
 21351868 | S. Weeks                   | Derek S. Sikes         | 2024-03-05 13:38:39.183082 | Bird collection agent.
 21351866 | B. Kelly                   | Derek S. Sikes         | 2024-03-05 13:38:39.13914  | Bird collection agent.
 21351835 | Don Carney                 | Derek S. Sikes         | 2024-03-05 13:38:29.950863 | Insect collector 1960
 21351834 | S. Kogl                    | Derek S. Sikes         | 2024-03-05 13:38:29.933996 | lepidoptera collector
 21351829 | M. Merrell                 | Derek S. Sikes         | 2024-03-05 13:38:29.842861 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351818 | Snarski                    | Derek S. Sikes         | 2024-03-05 13:38:29.660891 | Invertebrate bulkload agent. Possibly David J. Snarski?
 21351808 | W. S. McAlpine             | Derek S. Sikes         | 2024-03-05 13:38:29.46636  | collected Lepidoptera in Alaska
 21351805 | Jared Hughey               | Derek S. Sikes         | 2024-03-05 13:38:29.414319 | Bee collector in Alaska 2021
 21351801 | Mark Detterman             | Derek S. Sikes         | 2024-03-05 13:38:29.341203 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351798 | Buck                       | Derek S. Sikes         | 2024-03-05 13:38:29.300602 | Invertebrate bulkload agent.
 21351787 | Svendsen                   | Derek S. Sikes         | 2024-03-05 13:38:29.090336 | lepidoptera collector
 21351780 | Emily Blattmachr           | Derek S. Sikes         | 2024-03-05 13:38:28.987924 | Anchorage member of the public.
 21351778 | A. Bakke                   | Derek S. Sikes         | 2024-03-05 13:38:28.944342 | ALS volunteer Lepidoptera collector in AK
 21351772 | Olive Kanayurak            | Derek S. Sikes         | 2024-03-05 13:38:28.830954 | Barrow, AK resident 2021
 21351764 | Liz Masi                   | Derek S. Sikes         | 2024-03-05 13:38:28.699816 | collected Lepidoptera in AK
 21351751 | Chelonia Jones             | Derek S. Sikes         | 2024-03-05 13:38:28.435787 | Bee collector in Alaska 2021
 21351747 | K. Bevernitz               | Derek S. Sikes         | 2024-03-05 13:38:28.364936 | ALS volunteer Lepidoptera collector in AK
 21351740 | William A. Lehnhausen      | Derek S. Sikes         | 2024-03-05 13:38:28.24093  | Bird collection agent.
 21351724 | Paul Spitzer               | Derek S. Sikes         | 2024-03-05 13:38:27.955857 | Collector for Alaska Lepidoptera Survey
 21351714 | A. D. Robertson            | Derek S. Sikes         | 2024-03-05 13:38:27.771993 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351711 | Jon Berrie                 | Derek S. Sikes         | 2024-03-05 13:38:27.72166  | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351709 | Aliza Segal                | Derek S. Sikes         | 2024-03-05 13:38:27.680154 | Bee collector in Alaska 2021
 21351706 | Ginger Scoggin             | Derek S. Sikes         | 2024-03-05 13:38:27.627148 | PhD, DNP, ANP-C, Anchorage, AK, 2016
 21351700 | C. Cattell                 | Derek S. Sikes         | 2024-03-05 13:38:27.525267 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351691 | Vicky Koelzer              | Derek S. Sikes         | 2024-03-05 13:38:27.370989 | Gardener in Copper Basin, AK
 21351688 | L. Heimer                  | Derek S. Sikes         | 2024-03-05 13:38:27.312803 | ALS volunteer Lepidoptera collector in AK
 21351684 | Eric Pyne                  | Derek S. Sikes         | 2024-03-05 13:38:27.25229  | KW Philip ALS collector.
 21351680 | P. Redwood                 | Derek S. Sikes         | 2024-03-05 13:38:27.180336 | ALS volunteer Lepidoptera collector in AK
 21351675 | W. Arvey                   | Derek S. Sikes         | 2024-03-05 13:38:27.092862 | collected Lepidoptera in Alaska
 21351672 | Don Bee                    | Derek S. Sikes         | 2024-03-05 13:38:27.026079 | Collected fish with Cal Skaugstad in 1985
 21351667 | P. Merritt                 | Derek S. Sikes         | 2024-03-05 13:38:26.934564 | Insect collector Alaska
 21351659 | Terri Cole                 | Derek S. Sikes         | 2024-03-05 13:38:26.775315 | Collected in DNP, possible with Pat Pyne
 21351656 | J. Bente                   | Derek S. Sikes         | 2024-03-05 13:38:26.714895 | Lepidoptera collector
 21351643 | A. L. Sanchez              | Derek S. Sikes         | 2024-03-05 13:38:26.456186 | Bulkloaded MSB Mammal agent.
 21351640 | L. Halpin                  | Derek S. Sikes         | 2024-03-05 13:38:26.401436 | ALS volunteer Lepidoptera collector in AK
 21351636 | Bethany Walker             | Derek S. Sikes         | 2024-03-05 13:38:26.332221 | 2019 UAF Bug Camper, Fairbanks, AK
 21351631 | Eric Castro                | Derek S. Sikes         | 2024-03-05 13:38:26.250353 | Bee collector in Alaska 2021
 21351630 | T. Hudson                  | Derek S. Sikes         | 2024-03-05 13:38:26.233311 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351615 | M. Jetton                  | Derek S. Sikes         | 2024-03-05 13:38:25.973261 | ALS volunteer Lepidoptera collector in AK
 21351612 | T. Dickel                  | Derek S. Sikes         | 2024-03-05 13:38:25.926398 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351609 | Lyle Krichen               | Derek S. Sikes         | 2024-03-05 13:38:25.881901 | trapper in Cordova
 21351603 | H. Tagarook                | Derek S. Sikes         | 2024-03-05 13:38:25.764963 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351601 | L. F. Elliott              | Derek S. Sikes         | 2024-03-05 13:38:25.718377 | Associated with R. Wilk
 21351599 | Meekin                     | Derek S. Sikes         | 2024-03-05 13:38:25.66781  | Lepidoptera collector
 21351588 | Blakeslee                  | Derek S. Sikes         | 2024-03-05 13:38:20.529297 | First initials unknown, collected for Alaska Lepidoptera Survey in 1975
 21351582 | G. Cranna                  | Derek S. Sikes         | 2024-03-05 13:38:20.42808  | ALS volunteer Lepidoptera collector in AK
 21351580 | Terri Wild                 | Derek S. Sikes         | 2024-03-05 13:38:20.391425 | field technician Seward Peninsula, AK 2013
 21351579 | Skyler Jordan              | Derek S. Sikes         | 2024-03-05 13:38:20.373707 | field technician Seward Peninsula, AK 2013 [assumed to be same as Skyler C. Jordan, bee collector in Alaska 2021]
 21351569 | Jeff Foley                 | Derek S. Sikes         | 2024-03-05 13:38:20.218304 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351567 | J. Trent                   | Derek S. Sikes         | 2024-03-05 13:38:20.189499 | KWP, butterfly collector
 21351561 | Teri C. Wild               | Derek S. Sikes         | 2024-03-05 13:38:20.087479 | Bee collector in Alaska 2021
 21351560 | Walter T. Phillips         | Derek S. Sikes         | 2024-03-05 13:38:20.069083 | collected Lepidoptera in Alaska
 21351555 | Sue Quinlan                | Derek S. Sikes         | 2024-03-05 13:38:19.988178 | Migrated from ALA database.
 21351554 | B. Robertson               | Derek S. Sikes         | 2024-03-05 13:38:19.966345 | New Zealand collector on beetles 2002
 21351552 | T. Ward                    | Derek S. Sikes         | 2024-03-05 13:38:19.933408 | ALS volunteer Lepidoptera collector in AK
 21351542 | Bio 116 students           | Derek S. Sikes         | 2024-03-05 13:38:19.790886 | former agent type group
 21351541 | Lindsey Taylor             | Derek S. Sikes         | 2024-03-05 13:38:19.776148 | Bee collector in Alaska 2021
 21351514 | Anna-Marie Kokx            | Derek S. Sikes         | 2024-03-05 13:38:19.342074 | collector
 21351503 | Mary A. Calmes             | Derek S. Sikes         | 2024-03-05 13:38:19.151287 | Migrated from ALA database.
 21351498 | C. A. Pease                | Derek S. Sikes         | 2024-03-05 13:38:19.036977 | Charles A. Pease?
 21351495 | Karen Henderson            | Derek S. Sikes         | 2024-03-05 13:38:18.8914   | Collector for Alaska Lepidoptera Survey
 21351478 | Unalakleet School students | Derek S. Sikes         | 2024-03-05 13:38:18.601517 | former agent type group
 21351473 | G. Nielsen                 | Derek S. Sikes         | 2024-03-05 13:38:18.51093  | collected Lepidoptera for Alaska Lepidoptera Survey
 21351471 | M. Shepherd                | Derek S. Sikes         | 2024-03-05 13:38:18.380863 | Miss Margaret Shepherd?
 21351469 | G. Kunkle                  | Derek S. Sikes         | 2024-03-05 13:38:18.354872 | ALS volunteer Lepidoptera collector in AK
 21351467 | D. M. Olsen                | Derek S. Sikes         | 2024-03-05 13:38:18.315796 | D.M. Olson?
 21351465 | P. Bente                   | Derek S. Sikes         | 2024-03-05 13:38:18.284953 | ALS volunteer Lepidoptera collector in AK
 21351462 | V. Waldron                 | Derek S. Sikes         | 2024-03-05 13:38:18.230611 | Lepidoptera collector
 21351461 | Gene Darby                 | Derek S. Sikes         | 2024-03-05 13:38:18.213304 | citizen, Kenai, AK
 21351459 | R. Fuson                   | Derek S. Sikes         | 2024-03-05 13:38:18.175991 | Lepidoptera collector
 21351457 | J. Gorham                  | Derek S. Sikes         | 2024-03-05 13:38:18.128328 | collected for Alaska Lepidoptera Survey
 21351449 | R. Mackey                  | Derek S. Sikes         | 2024-03-05 13:38:17.973638 | Bulkloaded MSB Mammal agent.
 21351446 | F. Karpuleon               | Derek S. Sikes         | 2024-03-05 13:38:17.929703 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351445 | J. Jacobs                  | Derek S. Sikes         | 2024-03-05 13:38:17.907213 | J.F. Jacobs? J.W. Jacobs? Jeremy Jacobs?
 21351432 | T. True                    | Derek S. Sikes         | 2024-03-05 13:38:17.703254 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351417 | J. Perkins                 | Derek S. Sikes         | 2024-03-05 13:38:17.487678 | Lepidoptera collector in AK
 21351410 | Hollingsworth              | Derek S. Sikes         | 2024-03-05 13:38:17.379134 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351408 | Ruby An                    | Derek S. Sikes         | 2024-03-05 13:38:17.338636 | PhD student working at Toolik Field Station Alaska, 2022
 21351405 | William Dade               | Derek S. Sikes         | 2024-03-05 13:38:17.28883  | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351401 | Tony Bakos                 | Derek S. Sikes         | 2024-03-05 13:38:17.226548 | Bee collector in Alaska 2021
 21351400 | K. Bury                    | Derek S. Sikes         | 2024-03-05 13:38:17.202589 | Lepidoptera collector
 21351383 | M. Macgow                  | Derek S. Sikes         | 2024-03-05 13:38:16.920149 | Or could be M. Maogow
 21351381 | Roberts                    | Derek S. Sikes         | 2024-03-05 13:38:16.894943 | ALS volunteer Lepidoptera collector in AK
 21351380 | Komarek                    | Derek S. Sikes         | 2024-03-05 13:38:16.880916 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351379 | Charles Fahl               | Derek S. Sikes         | 2024-03-05 13:38:16.864203 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351373 | E. Hibler                  | Derek S. Sikes         | 2024-03-05 13:38:16.764772 | ALS volunteer Lepidoptera collector in AK
 21351372 | E. Halpin                  | Derek S. Sikes         | 2024-03-05 13:38:16.745089 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351361 | N. Threlkeld               | Derek S. Sikes         | 2024-03-05 13:38:16.565538 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351360 | Christina Trimingham       | Derek S. Sikes         | 2024-03-05 13:38:16.548931 | Bee collector in Alaska 2021
 21351358 | Mitchell A. Parsons        | Derek S. Sikes         | 2024-03-05 13:38:16.515226 | Bee collector in Alaska 2021
 21351347 | Stream Ecology Class UAF   | Derek S. Sikes         | 2024-03-05 13:38:16.3071   | Probably from University of Alaska Fairbanks
 21351346 | K. Flaccus                 | Derek S. Sikes         | 2024-03-05 13:38:16.285121 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351344 | B. Wood                    | Derek S. Sikes         | 2024-03-05 13:38:16.250742 | Insect Collector Alaska Caribou Creek Research
 21351342 | M. J. Kennedy-Smith        | Derek S. Sikes         | 2024-03-05 13:37:47.954922 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351341 | Mitch Parsons              | Derek S. Sikes         | 2024-03-05 13:37:47.937632 | field technician Seward Peninsula, AK 2013
 21351337 | Edward Szafran             | Derek S. Sikes         | 2024-03-05 13:37:47.870091 | ALS volunteer Lepidoptera collector in AK
 21351323 | T. Ovenshine               | Derek S. Sikes         | 2024-03-05 13:37:47.62894  | collected Lepidoptera for Alaska Lepidoptera Survey
 21351322 | K. Wilk                    | Derek S. Sikes         | 2024-03-05 13:37:47.612703 | Associated with R. Wilk
 21351319 | Natalie Konig              | Derek S. Sikes         | 2024-03-05 13:37:47.56063  | Bee collector in Alaska 2021
 21351317 | S. Temple                  | Derek S. Sikes         | 2024-03-05 13:37:47.529654 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351315 | Alivia Gonzalez            | Derek S. Sikes         | 2024-03-05 13:37:47.4981   | 2019 UAF Bug Camper, Fairbanks, AK
 21351298 | Don Richter                | Derek S. Sikes         | 2024-03-05 13:37:47.206773 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351290 | Cassandra Brown            | Derek S. Sikes         | 2024-03-05 13:36:54.974288 | Bee collector in Alaska 2021
 21351289 | J. Campbell                | Derek S. Sikes         | 2024-03-05 13:36:54.949897 | Bird collection agent.
 21351281 | L. Jennings                | Derek S. Sikes         | 2024-03-05 13:34:13.470231 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351275 | S. Campbell                | Derek S. Sikes         | 2024-03-05 13:34:13.395198 | Migrated from ALA database.
 21351267 | John D. Mertz              | Angela Linn            | 2024-03-04 18:41:16.341581 | UAM Ethnology and History
 21351257 | Dennis L. Shirley          | Alison Whiting         | 2024-03-02 16:51:25.487017 | retired UDWR employee
 21351254 | David Remple               | Alison Whiting         | 2024-03-02 16:34:08.904522 | Falconer
 21351252 | D. A. Fiedler              | Alison Whiting         | 2024-03-02 10:58:52.058268 | Masters in 1974 - St. Cloud State College
 21351175 | Ross M. Anderson           | Alison Whiting         | 2024-02-24 17:22:45.498319 | Ross lives in Sandy and raises parrots for a hobby.  He was the Hogle Zoo veterinarian for many years.
 21351155 | William R. Fraser          | Alison Whiting         | 2024-02-21 16:03:41.320016 | CEO of Polar Oceans Research Group
 21350937 | Natalie Lucero             | J. Tomasz Giermakowski | 2024-01-30 14:24:32.796167 | MSB:Herp
 21350915 | Matthew Campen             | Mariel L. Campbell     | 2024-01-26 17:11:31.136402 | UNMHS School of Pharmacy
 21350848 | Anna Brant                 | Katherine L. Anderson  | 2024-01-13 13:15:23.046587 | University of Washington graduate student
DerekSikes commented 2 months ago

This is why it is key to make remarks visible within the agent pick tool. Seeing remarks will help disambiguate agent picks.

dustymc commented 2 months ago

Seeing remarks will help disambiguate agent picks.

For the 1% (probable wild overestimation!) of agents who read (and understand - "ALS"==Amyotrophic Lateral Sclerosis, right?) them, maybe.

Putting those data where they belong makes them available to a much wider audience.

(And https://github.com/ArctosDB/arctos/issues/7434 is the place to change how the pick works, but try it in test first.)

dustymc commented 2 months ago

Dups

 agent_id | agent_type | preferred_agent_name |       creator       |        created_date        
----------+------------+----------------------+---------------------+----------------------------
 21351197 | person     | David Johnson        | Derek S. Sikes      | 2024-02-26 19:05:23.736418
 21295057 | person     | David Johnson        | Dusty L. McDonald   | 2015-10-06 11:30:13
 21352394 | person     | Elyssa Bush          | Kara Branchflower   | 2024-04-23 10:49:00.568389
 21352393 | person     | Elyssa Bush          | Kara Branchflower   | 2024-04-23 10:46:02.158056
 21352211 | person     | I. Pshenichnaya      | Elena Taboko Taku   | 2024-04-12 11:02:45.327919
 21352176 | person     | I. Pshenichnaya      | Elena Taboko Taku   | 2024-04-11 00:19:57.708599
 21351445 | person     | J. Jacobs            | Derek S. Sikes      | 2024-03-05 13:38:17.907213
 21350919 | person     | J. Jacobs            | Jessica Weller      | 2024-01-27 11:25:38.328081
 21333621 | person     | Lauren Wilson        | Zack Perry          | 2021-07-13 12:39:08.99178
 21300714 | person     | Lauren Wilson        | Erica Krimmel       | 2016-04-12 12:17:17
 21332078 | person     | Linda Moore          | Jonathan L. Dunnum  | 2021-03-16 13:19:03.703952
 21330095 | person     | Linda Moore          | Andrew Charles Doll | 2020-07-27 12:41:15.857541
 21352220 | person     | L. Mironova          | Elena Taboko Taku   | 2024-04-12 11:50:41.997135
 21352174 | person     | L. Mironova          | Elena Taboko Taku   | 2024-04-11 00:09:32.275383
 21352213 | person     | O. Babarykina        | Elena Taboko Taku   | 2024-04-12 11:13:00.603817
 21352177 | person     | O. Babarykina        | Elena Taboko Taku   | 2024-04-11 00:23:58.065582
 21352417 | person     | Tim Wheeler          | Mariel L. Campbell  | 2024-04-24 13:52:16.752771
 21282782 | person     | Tim Wheeler          | Jordan Metzgar      | 2014-10-22 15:45:14

didn't click the magic button

 agent_id | preferred_agent_name |    creator    |        created_date        
----------+----------------------+---------------+----------------------------
 21352408 | Mr. George Laflin    | Olivia Cimino | 2024-04-24 10:30:11.712287
Jegelewicz commented 2 months ago

I think some of these are due to lags in the cache. I have added agents and for a while, search won't find them except when I search the EXACT preferred name. I can see how people would add someone again....this needs training.

Jegelewicz commented 2 months ago

didn't click the magic button

what magic button?

Jegelewicz commented 2 months ago

One of the David Johnsons carries the 'not the same as' relationship.

Jegelewicz commented 2 months ago

One of the J. Jacobs agents carries the 'not the same as' relationship.

Jegelewicz commented 2 months ago

One of the Linda Moores carries the 'not the same as' relationship.

Jegelewicz commented 2 months ago

@camwebb I think some of these may be your students? See https://github.com/ArctosDB/arctos/issues/7649#issuecomment-2076031141

Jegelewicz commented 2 months ago

Dupes addressed

dustymc commented 2 months ago

lags in the cache

I did write the possibility of changing the cache time (including to zero) into the API, so there's no technical trick in allowing (defaulting, whatever) "us" an override. That could also be used to melt the server, so guidance needed.

needs training

Clearly.

what magic button?

Take your pick, there's one that'll go from....

Screenshot 2024-04-25 at 07 13 56

to

Screenshot 2024-04-25 at 07 14 06

and one that'll go from....

Screenshot 2024-04-25 at 07 14 24

to

Screenshot 2024-04-25 at 07 14 30

not the same as relationship

I can rewrite the scripts, but I think we're up to one example (which I of course can't find) of those being based on any sort of evidence, so ignoring that relationship in data meant for human review seems right. https://github.com/ArctosDB/arctos/issues/7719 supports this, I think: creating the duplicate seems defensible, but I can't understand why the relationship was added. I'd still get rid of it if I could....

DerekSikes commented 2 months ago

I just made an agent: Tom Rickman

this was after a research associate wrote: "when I try to enter the collector, Tom Rickman, I tried to create the person as before but it errors out no matter what. Tom is alive. It's really quirky. If I try to add any additional info on him then it just errors out. If I don't enter any of the fields I get the option to force create and then it errors out. "

I was able to create Tom Rickman, but when I did I got a page that showed an unintelligible short list of two Tom Rickmans and a force create button. Why two? I had searched and no one existed with that name.

I force created and it worked. Why did it not work for my research associate, Joey Slowik?

And then I told Joey I made the agent but he wrote back:

"Well I got the boxes to turn green but the errors still exist anywhere I put Tom's name. Ideas?

2024-4-25T10:45:30: FAIL: agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents: {"message":"agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents","status":"fail"}"

So I'd say this make-new-agents process is broken in many ways.

dustymc commented 2 months ago
arctos-> order by preferred_agent_name,created_date desc;
 agent_id |  agent_type  | preferred_agent_name |       creator       |        created_date        
----------+--------------+----------------------+---------------------+----------------------------
 21351197 | person       | David Johnson        | Derek S. Sikes      | 2024-02-26 19:05:23.736418
 21295057 | person       | David Johnson        | Dusty L. McDonald   | 2015-10-06 11:30:13
 21352459 | organization | Diamond Superior     | Angela Linn         | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior     | Angela Linn         | 2024-04-29 14:17:46.457997
 21351445 | person       | J. Jacobs            | Derek S. Sikes      | 2024-03-05 13:38:17.907213
 21350919 | person       | J. Jacobs            | Jessica Weller      | 2024-01-27 11:25:38.328081
 21333621 | person       | Lauren Wilson        | Zack Perry          | 2021-07-13 12:39:08.99178
 21300714 | person       | Lauren Wilson        | Erica Krimmel       | 2016-04-12 12:17:17
 21332078 | person       | Linda Moore          | Jonathan L. Dunnum  | 2021-03-16 13:19:03.703952
 21330095 | person       | Linda Moore          | Andrew Charles Doll | 2020-07-27 12:41:15.857541
 21352495 | person       | P. V. Nesterov       | Elena Taboko Taku   | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov       | unknown             | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler          | Mariel L. Campbell  | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler          | Jordan Metzgar      | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman          | Derek S. Sikes      | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman          | Jozef A. Slowik     | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman          | Jozef A. Slowik     | 2024-04-25 12:06:48.200331
(17 rows)
DerekSikes commented 2 months ago

Can you restrict these searches to exclude those with relationship 'not the same as' please?

dustymc commented 2 months ago

Cool stuff in remarks

 agent_id | preferred_agent_name |      creator      |        created_date        |                        attribute_value                        
----------+----------------------+-------------------+----------------------------+---------------------------------------------------------------
 21352576 | Sally Whetstone      | Elena Taboko Taku | 2024-05-08 00:11:27.394131 | Collected plants in Alaska in 1981.
 21352575 | Howard Ulrich        | Elena Taboko Taku | 2024-05-08 00:07:50.725251 | Collected plants in Alaska in 1985.
 21352574 | Richard G. Holoway   | Elena Taboko Taku | 2024-05-08 00:04:12.008903 | Collected plants in Alaska in 1979.
 21352573 | N. D. Atwod          | Elena Taboko Taku | 2024-05-07 23:59:59.088798 | Collected plants in United States, Arizona in 1973.
 21352572 | B. Mitchell          | Elena Taboko Taku | 2024-05-07 23:56:33.364086 | Collected plants in Canada in 1978.
 21352571 | K. Paige             | Elena Taboko Taku | 2024-05-07 23:53:06.191529 | Collected plants in Canada in 2001.
 21352570 | J. L. Penny          | Elena Taboko Taku | 2024-05-07 23:51:20.805321 | Collected plants in Canada in 2001.
 21352569 | J. Miner             | Elena Taboko Taku | 2024-05-07 23:46:32.015662 | Collected plants in Alaska in 2008.
 21352557 | Alfonso Doucette     | Elena Taboko Taku | 2024-05-06 23:44:27.330045 | Collected plants in Alaska in 2007.
 21352554 | Giovana D'Angelo     | Mingna Zhuang     | 2024-05-06 10:04:32.123804 | graduate student of entomology
 21352548 | N. Fedorova          | Elena Taboko Taku | 2024-05-05 15:09:30.370723 | Collected plants in Turkmenistan in 1940 with Al. A. Fedorov.
 21352546 | J. Ryder             | Elena Taboko Taku | 2024-05-05 00:39:57.671707 | Collected in Canada with Bruce A. Bennett in 2005.
 21352545 | J. Line              | Elena Taboko Taku | 2024-05-05 00:36:03.159676 | Collected plants in Canada with Bruce A. Bennett in 2005.
 21352544 | O. Ceska             | Elena Taboko Taku | 2024-05-05 00:28:20.448506 | Collected plants in Canada in 2002.
 21352543 | C. S. Tomb           | Elena Taboko Taku | 2024-05-04 00:13:29.460839 | Collected plants in Russia in 1978.
 21352542 | J. B. McCarthy       | Elena Taboko Taku | 2024-05-04 00:06:23.235148 | Collected plants in Alaska in 1984.
 21352540 | Ann Odasz            | Elena Taboko Taku | 2024-05-03 23:36:08.575995 | Collected plants in United States, Wyoming in 1979.
 21352538 | J. Alto              | Elena Taboko Taku | 2024-05-03 13:40:47.843067 | Collected plants in Alaska in 1985.
 21352537 | Anders Michelsen     | Elena Taboko Taku | 2024-05-03 13:38:37.474309 | Collected plants in Greenland in 1984.
 21352536 | Helle Byrge          | Elena Taboko Taku | 2024-05-03 13:37:05.162128 | Collected plants in Greenland in 1984.
 21352535 | L. S. Dick           | Elena Taboko Taku | 2024-05-03 13:33:12.549559 | Collected plants in Alaska in 1970.
 21352534 | F. LeBlanc           | Elena Taboko Taku | 2024-05-03 13:29:41.051965 | Collected plants in Canada in 1961.
 21352533 | F. Ernest            | Elena Taboko Taku | 2024-05-03 13:23:07.398095 | Collected plants in Canada in 1961.
 21352532 | L. Duhamel           | Elena Taboko Taku | 2024-05-03 13:21:15.439678 | Collected plants in Canada in 1961.
 21352529 | L. M. Zudova         | Elena Taboko Taku | 2024-05-03 12:54:46.906752 | Collected plants in Russia in 1971.
 21352518 | V. N. Kononov        | Elena Taboko Taku | 2024-05-02 00:43:32.724689 | Collected plants in Moldova in 1949 -1958.
 21352515 | Marcie E. Mondt      | Alyssa Semerdjian | 2024-05-01 14:32:22.871172 | Cal Poly Humboldt student in early 1990s
DerekSikes commented 2 months ago

I just searched on Tom Rickman who has 3 agent IDs in your prior list and yet only 1 agent is found from my search. What's up? I expected to find 3.

dustymc commented 2 months ago

temp_dead.csv

dustymc commented 2 months ago

What's up?

Bug - thanks, I'll squash it for next release.

dustymc commented 2 months ago

relationship

Here are dups which don't have any relationships. (Which could involve the worst possible scenario: A well-known agent isn't getting proper attribution because there's a duplicate, and now I'm ignoring that in the reports-or-whatever because one of them has some data. That'll need careful handling if this goes anywhere.)

On the subject of going somewhere, please see https://github.com/ArctosDB/arctos/issues/7649#issue-2232072678 - my goal here isn't to clean up little bits and pieces, it's to develop policy regarding the extent to which low-quality data should be an "Arctos problem," and what "low quality" means plus what I can do about it if we do want to establish any sort of standards/suggestions/procedures/reports/whatever.

 agent_id |  agent_type  | preferred_agent_name |      creator       |        created_date        
----------+--------------+----------------------+--------------------+----------------------------
 21352459 | organization | Diamond Superior     | Angela Linn        | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior     | Angela Linn        | 2024-04-29 14:17:46.457997
 21352495 | person       | P. V. Nesterov       | Elena Taboko Taku  | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov       | unknown            | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler          | Mariel L. Campbell | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler          | Jordan Metzgar     | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman          | Derek S. Sikes     | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman          | Jozef A. Slowik    | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman          | Jozef A. Slowik    | 2024-04-25 12:06:48.200331
(9 rows)
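
For anyone wanting to reason about that filter outside the database, here is a minimal Python sketch of the idea: group agents by preferred name, then drop any group in which a member carries a relationship. This is not the query used for the list above. The function name, the input shapes, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def relationship_free_duplicates(agents, related_ids):
    """Return {preferred_name: [agent_id, ...]} for names shared by 2+ agents
    where no agent in the group carries any relationship.

    agents      -- iterable of (agent_id, preferred_agent_name) tuples
    related_ids -- set of agent_ids that have at least one relationship
                   (for example 'not the same as')
    """
    by_name = defaultdict(list)
    for agent_id, name in agents:
        by_name[name].append(agent_id)

    return {
        name: ids
        for name, ids in by_name.items()
        if len(ids) > 1 and not any(i in related_ids for i in ids)
    }


# Hypothetical rows for illustration only; these are not Arctos agents.
agents = [
    (1, "A. Example"),
    (2, "A. Example"),
    (3, "B. Sample"),
    (4, "B. Sample"),
    (5, "C. Unique"),
]
related_ids = {3}  # agent 3 carries some relationship, so its group is skipped

print(relationship_free_duplicates(agents, related_ids))
```

Dropping the whole group when any one member has a relationship is what creates the risk described above: a real duplicate can hide behind a single relationship row.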
dustymc commented 2 months ago

The aforementioned bug involved person-agents with no first or last name. Note that the create form will suggest these values with one click.


 agent_id | preferred_agent_name |     creator     |        created_date        
----------+----------------------+-----------------+----------------------------
 21352432 | Tom Rickman          | Jozef A. Slowik | 2024-04-25 12:09:25.013787
 21352431 | Tom Rickman          | Jozef A. Slowik | 2024-04-25 12:06:48.200331
 21352408 | Mr. George Laflin    | Olivia Cimino   | 2024-04-24 10:30:11.712287
 21352122 | Jack Spratt          | Jozef A. Slowik | 2024-04-05 12:12:44.141425
 21352119 | C. Stillman          | Jozef A. Slowik | 2024-04-04 16:05:10.937667
 21352117 | B. S. Blitz          | Jozef A. Slowik | 2024-04-04 15:34:50.688057
 21352116 | F. Sorensen          | Jozef A. Slowik | 2024-04-04 15:25:19.630444
 21352109 | M. Rosy              | Jozef A. Slowik | 2024-04-04 12:59:13.243463
dustymc commented 2 months ago

From error logs: https://arctos.database.museum/agent.cfm?agent_name=leah%25barr

Jegelewicz commented 2 months ago

From error logs: https://arctos.database.museum/agent.cfm?agent_name=leah%25barr

Cleaned up, but as long as untrained students are doing this, I expect it to happen daily....

dustymc commented 2 months ago

untrained students

Setting policy on that sort of thing seems like something Arctos (the community, not the techy-bits) could do.

dustymc commented 2 months ago

Relationship-free duplicate-ish:


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352459 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:17:46.457997
 21331498 | person       | Guillermo D’Elía       | James L. Patton          | 2020-12-13 16:25:11.485838
 21247708 | person       | Guillermo D'Elía       | unknown                  | 2013-12-16 21:49:31
 21280994 | person       | Jack DeVille           | Dusty L. McDonald        | 2014-04-30 14:23:48
 21279626 | person       | Jack De Ville          | Dusty L. McDonald        | 2014-04-30 14:01:26
 21352605 | person       | Jorge Galindo Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:02:13.260878
 21352604 | person       | Jorge Galindo-Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:00:41.018711
 21334283 | person       | J. O. Sullivan         | Teresa J. Mayfield-Meyer | 2021-09-07 16:29:24.897153
     7604 | person       | J. O'Sullivan          | unknown                  | 2013-12-16 21:49:31
  1017329 | person       | LaRue                  | unknown                  | 2013-12-16 21:49:31
 21253481 | person       | La Rue                 | unknown                  | 2013-12-16 21:49:31
  1011480 | person       | L. VanHorn             | unknown                  | 2013-12-16 21:49:31
  1010287 | person       | L. Van Horn            | unknown                  | 2013-12-16 21:49:31
 21253957 | person       | Mary O’Donnel          | unknown                  | 2013-12-16 21:49:31
 21256873 | person       | Mary O'Donnel          | unknown                  | 2013-12-16 21:49:31
 21286922 | person       | Norma Le Veque         | Dusty L. McDonald        | 2014-11-07 12:22:41
 21285505 | person       | Norma LeVeque          | Dusty L. McDonald        | 2014-11-07 12:22:32
 21352495 | person       | P. V. Nesterov         | Elena Taboko Taku        | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov         | unknown                  | 2013-12-16 21:49:31
 21258795 | person       | Rößner                 | unknown                  | 2013-12-16 21:49:31
 21257004 | person       | Röner                  | unknown                  | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler            | Mariel L. Campbell       | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler            | Jordan Metzgar           | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman            | Derek S. Sikes           | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:06:48.200331
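
A rough Python sketch of one way to surface "duplicate-ish" names like these: reduce each preferred name to a comparison key and group on the key. The key used here (lowercase, keep only a-z) is an assumption chosen because it collapses the spacing, hyphen, and apostrophe variants in the list above. It is not necessarily what the report actually does, and `name_key` and `near_duplicates` are hypothetical names.

```python
import re
from collections import defaultdict


def name_key(name: str) -> str:
    """Lowercase the name and drop every character that is not a-z."""
    return re.sub(r"[^a-z]", "", name.lower())


def near_duplicates(agents):
    """Group (agent_id, preferred_agent_name) pairs whose keys collide.

    Returns a list of groups; each group has two or more members.
    """
    groups = defaultdict(list)
    for agent_id, name in agents:
        groups[name_key(name)].append((agent_id, name))
    return [members for members in groups.values() if len(members) > 1]


# A few rows copied from the list above, plus one name with no near match.
sample = [
    (21280994, "Jack DeVille"),
    (21279626, "Jack De Ville"),
    (21352605, "Jorge Galindo Gonzalez"),
    (21352604, "Jorge Galindo-Gonzalez"),
    (21334283, "J. O. Sullivan"),
    (7604, "J. O'Sullivan"),
    (21352554, "Giovana D'Angelo"),
]

for group in near_duplicates(sample):
    print(group)
```

The key is deliberately aggressive: it throws away every non a-z character, accented letters included, so hits are candidates for human review rather than automatic merges.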
Jegelewicz commented 2 months ago

@DerekSikes can you give us any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352433 | person       | Tom Rickman            | Derek S. Sikes           | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:06:48.200331
Jegelewicz commented 2 months ago

@AJLinn any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352459 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:17:46.457997
Jegelewicz commented 2 months ago

@jldunnum any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352605 | person       | Jorge Galindo Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:02:13.260878
 21352604 | person       | Jorge Galindo-Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:00:41.018711
Jegelewicz commented 2 months ago

I am hopeful that we can figure out why these almost-immediate duplicates were made. Is there something going on that could help make adding agents better?

mkoo commented 2 months ago

Might be the feedback lag. Once you create or edit an agent, the cache is a serious impediment to instant gratification/acknowledgment. So it seems to me multiples get made because the user receives no confirmation or feedback and thinks nothing has happened (so you do it again!). I have had this happen to me with editing profiles and adding more data to agents.

I do think there's a technical solution to this whether UI/UX or maybe a backend resource reallocation to make this more realtime-y?

dustymc commented 2 months ago

lag

yes, I'm pretty sure that's part of it.

realtime-y

Week-ish done: https://github.com/ArctosDB/arctos/issues/7738 (and only maybe-one meltdown, not too bad!)

mkoo commented 2 months ago

lag

yes, I'm pretty sure that's part of it.

So maybe some nice obvious feedback banners, when someone hits buttons!? (You just made an agent! Edit SAVED!) Could be applied generally with button pressing action

realtime-y

Week-ish done: #7738 (and only maybe-one meltdown, not too bad!)

Well, one of these was made today; we could see if any further suspicious duplicate agents get made, or maybe make it more realtimey?

dustymc commented 2 months ago

more realtimey

Uhh - the spice must flow??

I think it is in real time, or as close as the speed of stuffing photons through a few hundred miles of fiber optics allows. If there's some way to make that not happen, I'd REALLY like to know about it. If there's some other bug, I'd also like to know about that. UI improvement ideas always welcome as well.

mkoo commented 2 months ago

haha, yeah, if that's the case no need for hallucinated realities (I thought I read that it was on a less-than-weekly update)!

Then we are left with some UI/UX tweaks? I don't know if any other guardrails on actual creation are needed right now, since that would put us right back where we were. I might need a realtime chat to spitball more ...

dustymc commented 2 months ago

weekly

GAK! No, it was something less than an hour!

guardrails to actual creation

One alternative possibility is reports as above (but maybe not with only Teresa fixing everything...).

chat

Yea, probably.

Also worth mentioning that the vast majority of agents are pretty good. There's one verified agent without additional information (University of Colorado by @ebraker), and ~700 of the ~2000 created in the last 6 months lack clarifying data (most of those are me pretending to be Derek recovering verbatim agents - many of the rest do have some information, it's just stuffed into remarks - report here ).

This should run in writeSQL if anyone's curious:


-- Agents created in the last 6 months that have no address, identifier,
-- relationship, or event attribute.
select
    agent.agent_id bare_id,
    'https://arctos.database.museum/agent/'||agent.agent_id agent_id,
    agent.preferred_agent_name,
    getPreferredAgentName(agent.created_by_agent_id) creator
from 
    agent
    -- the join only matches attributes of the "clarifying" types listed in the subquery
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in (
        select attribute_type from ctagent_attribute_type where purpose in ('address','identifier','relationship')
        union select 'event' attribute_type
    ) 
where 
    -- no matching attribute row means the agent lacks clarifying data
    agent_attribute.attribute_id is null and
    created_date > current_timestamp - interval '6 months'
Jegelewicz commented 2 months ago

I have to say - doing this "merge" was a giant pain in the ... and took up way too much of my time. We need something a little more automated - at least for moving all of the attributes from the "bad duplicate" to the agent that will remain.
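
To make the request concrete, here is a minimal Python sketch of what "moving all of the attributes" might mean. It is an illustration only, not how Arctos stores or merges agents. The dict layout, the `merge_attributes` name, and the sample agents are hypothetical, and representing "bad duplicate of" as an ordinary attribute row is an assumption.

```python
def merge_attributes(keep, duplicate):
    """Move attributes from `duplicate` onto `keep`, skipping exact repeats.

    Both arguments are dicts with an "agent_id" and an "attributes" list of
    {"type": ..., "value": ...} dicts. Afterwards the duplicate holds only a
    pointer back to the agent that remains. Returns the attributes moved.
    """
    existing = {(a["type"], a["value"]) for a in keep["attributes"]}
    moved = []
    for attr in duplicate["attributes"]:
        key = (attr["type"], attr["value"])
        if key not in existing:
            keep["attributes"].append(attr)
            existing.add(key)
            moved.append(attr)
    # Assumption: the link is recorded as a 'bad duplicate of' attribute.
    duplicate["attributes"] = [
        {"type": "bad duplicate of", "value": keep["agent_id"]}
    ]
    return moved


# Hypothetical agents for illustration only.
keep = {
    "agent_id": 1,
    "attributes": [{"type": "remarks", "value": "Collector in Alaska"}],
}
duplicate = {
    "agent_id": 2,
    "attributes": [
        {"type": "remarks", "value": "Collector in Alaska"},  # exact repeat
        {"type": "aka", "value": "A. Example"},               # new information
    ],
}

moved = merge_attributes(keep, duplicate)
print(moved)
print(keep["attributes"])
```

Skipping only exact repeats is the conservative choice: anything that differs at all is carried over, so a human still has to decide which of two conflicting values is right.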

jldunnum commented 2 months ago

Yes when creating I initially got an error and it didn’t appear like it had created the agent.


Jegelewicz commented 2 months ago

@jldunnum thanks - if that happens again can you get us a screenshot of the error?

jldunnum commented 2 months ago

Yup will do. It happened out of the loan creation screen. Needed to add an agent when creating a loan.



Jegelewicz commented 2 months ago

@dustymc see above - maybe a clue to the duplicates created recently?

Jegelewicz commented 2 months ago

Resolved https://github.com/ArctosDB/arctos/issues/7649#issuecomment-2103389523

Made Jorge Galindo Gonzalez a bad duplicate of Jorge Galindo-Gonzalez