isamplesorg / isamples_inabox

Provides functionality intermediate to a collection and central

Bug in OpenContext fetcher where we wipe the resolved_url #91

Closed dannymandel closed 2 years ago

dannymandel commented 2 years ago

A bug in the OpenContext updater cleared the resolved_url field, causing one of the automated integration tests to fail.

select * from thing where id='http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320';
                                  id                                  |            tstamp             |        tcreated        | item_type | authority_id | related | log | resolved_url | resolved_status |           tresolved           | resolve_elapsed |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         resolved_content                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | resolved_media_type |   _id   
----------------------------------------------------------------------+-------------------------------+------------------------+-----------+--------------+---------+-----+--------------+-----------------+-------------------------------+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------
 http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320 | 2021-09-21 02:10:02.691716-07 | 2021-09-14 17:54:59-07 | sample    | OPENCONTEXT  | null    |     |              |             200 | 2021-11-02 02:10:51.387673-07 |                 | {"uri": "http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320", "label": "Tibia; Right", "updated": "2021-09-15T00:54:59Z", "latitude": -26.63117, "longitude": -65.76979, "published": "2007-01-01T00:00:00Z", "Contributor": [{"id": "http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767", "label": "Guillermo L Mengoni Goñalons"}, {"id": "http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f", "label": "Dolores C Elkin"}], "context uri": "http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8", "late bce/ce": 1990.0, "project uri": "http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850", "citation uri": "https://n2t.net/ark:/28722/k26t12w6h", "early bce/ce": 1950, "context label": "Argentina/Salta/Reference Lama guanicoe", "item category": "Animal Bone", "project label": "Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina", "Has taxonomic identifier": [{"id": "http://www.gbif.org/species/5220188", "label": "Lama guanicoe"}], "Has anatomical identification": [{"id": "http://purl.obolibrary.org/obo/UBERON_0000979", "label": "tibia"}]} | application/json    | 5828364

In particular, tresolved is 2021-11-02 02:10:51.387673-07, which is this morning, but resolved_url is NULL. The integration test fails because it asserts that resolved_url is non-NULL.

dannymandel commented 2 years ago

A couple of questions:

(1) Why are we saving records that didn't change? Can we not do that?
(2) Why are we tossing the resolved_url on the floor?
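
To make both questions concrete, here is a minimal sketch of what an update step could look like if it skipped unchanged records and never overwrote an existing resolved_url with nothing. The `Thing` dataclass and `update_existing` function below are hypothetical stand-ins for illustration only; they are not the actual isb_lib model or the change made in the fix, and only the field names resolved_url, resolved_content, and tresolved come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Thing:
    """Hypothetical stand-in for a row of the thing table (not the real model)."""
    id: str
    resolved_url: Optional[str]
    resolved_content: dict
    tresolved: Optional[datetime] = None


def update_existing(existing: Thing, fetched: dict,
                    fetched_url: Optional[str], now: datetime) -> bool:
    """Apply a freshly fetched record to an existing row.

    Returns True if the row changed and should be saved, False otherwise.
    """
    # Question (1): nothing changed upstream, so leave the row alone.
    if fetched == existing.resolved_content:
        return False
    existing.resolved_content = fetched
    # Question (2): only replace resolved_url when we actually have a new one.
    if fetched_url is not None:
        existing.resolved_url = fetched_url
    existing.tresolved = now
    return True


uri = "http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320"
thing = Thing(id=uri, resolved_url=uri, resolved_content={"label": "Tibia; Right"})
now = datetime(2021, 11, 2, 9, 10, 51, tzinfo=timezone.utc)

# Same content as before: no save needed, resolved_url untouched.
unchanged = update_existing(thing, {"label": "Tibia; Right"}, None, now)

# Changed content but no URL supplied: save, and keep the old resolved_url.
changed = update_existing(thing, {"label": "Tibia; Left"}, None, now)
```

With this shape, a record that comes back identical is never re-saved, and a record that did change keeps its previous resolved_url unless the fetcher supplies a new one.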

Here's the full log from the run this morning that broke it:

2021-11-02T02:10:03 isb_lib.core:INFO: Using database at: postgresql+psycopg2://isb_writer@localhost/isb_1
2021-11-02T02:10:03 isb_lib.core:INFO: Using solr at: http://localhost:8983/solr/isb_core_records/
2021-11-02T02:10:03 main:INFO: loadRecords: <sqlmodel.orm.session.Session object at 0x7fd4faa88dc0>
2021-11-02T02:10:03 isb_lib.opencontext_adapter:INFO: trying to hit https://opencontext.org/subjects-search/.json?add-attribute-uris=1&attributes=obo-foodon-00001303%2Coc-zoo-has-anat-id%2Ccidoc-crm-p2-has-type%2Ccidoc-crm-p45-consists-of%2Ccidoc-crm-p49i-is-former-or-current-keeper-of%2Ccidoc-crm-p55-has-current-location%2Cdc-terms-temporal%2Cdc-terms-creator%2Cdc-terms-contributor&prop=oc-gen-cat-sample-col%7C%7Coc-gen-cat-bio-subj-ecofact%7C%7Coc-gen-cat-object&response=metadata%2Curi-meta&sort=updated--desc
2021-11-02T02:10:05 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/aa307308-0493-40fe-90f5-338effc6f665
2021-11-02T02:10:05 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/1aa7f220-29e9-4860-8982-dd1d58c2a0b5
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/f9bfc564-9e51-4764-8719-0f0203e7a342
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/58e5d426-48ea-43ff-a1fe-fff7c835e114
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: records_in_page Record id: http://opencontext.org/subjects/21aad781-8357-41bd-a11d-f6893cd832fd
2021-11-02T02:10:06 isb_lib.opencontext_adapter:INFO: Iterated record with updated date 2021-09-15 00:54:58+00:00 earlier than previous max 2021-09-14 17:54:59-07:00. Update is complete.
2021-11-02T02:10:06 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/aa307308-0493-40fe-90f5-338effc6f665', 'citation uri': 'https://n2t.net/ark:/28722/k2m337966', 'label': 'Metacarpal; Right', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta/Reference Lama guanicoe', 'context uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}], 'Has taxonomic identifier': [{'id': 'http://www.gbif.org/species/5220188', 'label': 'Lama guanicoe'}], 'Has anatomical identification': [{'id': 'http://purl.obolibrary.org/obo/UBERON_0013587', 'label': 'fused metacarpal bones 3 and 4'}]}
2021-11-02T02:10:06 root:INFO: Already have http://opencontext.org/subjects/aa307308-0493-40fe-90f5-338effc6f665
2021-11-02T02:10:46 root:INFO: Just saved existing thing
2021-11-02T02:10:46 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/1aa7f220-29e9-4860-8982-dd1d58c2a0b5', 'citation uri': 'https://n2t.net/ark:/28722/k2bk1sp89', 'label': 'Metatarsal; Right', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta/Reference Lama guanicoe', 'context uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}], 'Has taxonomic identifier': [{'id': 'http://www.gbif.org/species/5220188', 'label': 'Lama guanicoe'}], 'Has anatomical identification': [{'id': 'http://purl.obolibrary.org/obo/UBERON_0013588', 'label': 'fused metatarsal bones 3 and 4'}]}
2021-11-02T02:10:47 root:INFO: Already have http://opencontext.org/subjects/1aa7f220-29e9-4860-8982-dd1d58c2a0b5
2021-11-02T02:10:48 root:INFO: Just saved existing thing
2021-11-02T02:10:48 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/f9bfc564-9e51-4764-8719-0f0203e7a342', 'citation uri': 'https://n2t.net/ark:/28722/k2vh5zj3n', 'label': 'Radioulna; Right', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta/Reference Lama guanicoe', 'context uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}], 'Has taxonomic identifier': [{'id': 'http://www.gbif.org/species/5220188', 'label': 'Lama guanicoe'}], 'Has anatomical identification': [{'id': 'http://purl.obolibrary.org/obo/UBERON_0006715', 'label': 'radio-ulna'}]}
2021-11-02T02:10:49 root:INFO: Already have http://opencontext.org/subjects/f9bfc564-9e51-4764-8719-0f0203e7a342
2021-11-02T02:10:49 root:INFO: Just saved existing thing
2021-11-02T02:10:49 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'citation uri': 'https://n2t.net/ark:/28722/k2fb5gm1b', 'label': 'Reference Lama guanicoe', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta', 'context uri': 'http://opencontext.org/subjects/c1de5821-cfc3-4eef-bbbb-fc83bfa5ed94', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}]}
2021-11-02T02:10:49 root:INFO: Already have http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8
2021-11-02T02:10:50 root:INFO: Just saved existing thing
2021-11-02T02:10:50 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/58e5d426-48ea-43ff-a1fe-fff7c835e114', 'citation uri': 'https://n2t.net/ark:/28722/k2417d019', 'label': 'Scapula; Right', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta/Reference Lama guanicoe', 'context uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}], 'Has taxonomic identifier': [{'id': 'http://www.gbif.org/species/5220188', 'label': 'Lama guanicoe'}], 'Has anatomical identification': [{'id': 'http://purl.obolibrary.org/obo/UBERON_0006849', 'label': 'scapula'}]}
2021-11-02T02:10:50 root:INFO: Already have http://opencontext.org/subjects/58e5d426-48ea-43ff-a1fe-fff7c835e114
2021-11-02T02:10:51 root:INFO: Just saved existing thing
2021-11-02T02:10:51 main:INFO: got next id from open context {'uri': 'http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320', 'citation uri': 'https://n2t.net/ark:/28722/k26t12w6h', 'label': 'Tibia; Right', 'project label': 'Guanaco (Lama guanicoe) osteometric data from an individual from Northwest Argentina', 'project uri': 'http://opencontext.org/projects/01860ee9-6fb6-4e71-8958-113fb852c850', 'context label': 'Argentina/Salta/Reference Lama guanicoe', 'context uri': 'http://opencontext.org/subjects/b2e0e631-1661-48b3-9f60-0928fe5bd8f8', 'latitude': -26.63117, 'longitude': -65.76979, 'early bce/ce': 1950, 'late bce/ce': 1990.0, 'item category': 'Animal Bone', 'published': '2007-01-01T00:00:00Z', 'updated': '2021-09-15T00:54:59Z', 'Contributor': [{'id': 'http://opencontext.org/persons/ad3215c1-6116-45f8-8cef-57dbf7aa9767', 'label': 'Guillermo L Mengoni Goñalons'}, {'id': 'http://opencontext.org/persons/517e2071-e8cc-461c-8d47-180ce9bd5c8f', 'label': 'Dolores C Elkin'}], 'Has taxonomic identifier': [{'id': 'http://www.gbif.org/species/5220188', 'label': 'Lama guanicoe'}], 'Has anatomical identification': [{'id': 'http://purl.obolibrary.org/obo/UBERON_0000979', 'label': 'tibia'}]}
2021-11-02T02:10:51 root:INFO: Already have http://opencontext.org/subjects/f5a17d69-7354-43e9-bf78-d415b6ee0320
2021-11-02T02:10:52 root:INFO: Just saved existing thing
2021-11-02T02:10:52 main:INFO: total num records 6

dannymandel commented 2 years ago

Fixed by https://github.com/isamplesorg/isamples_inabox/pull/96