Open yuanzhou opened 5 months ago
@sunset666 can you list the API calls made against entity-api and ingest-api in sequence?
From Sunset:
First one is a simple PUT call to entity API, endpoint /entities/<uuid>
json: {'status': 'submitted'}
There is a for loop over all the UUIDs of an upload (for these particular runs, one with 83 and another with 102).
After that is complete, in a later step (not immediately after), there is a GET call to the same endpoint for the same list of UUIDs.
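For reference, the sequence described above could be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: the base URL, the bearer-token header, and the function names are assumptions, and the `send` parameter exists only so the calls can be exercised without a live entity-api.

```python
import json
import urllib.request

# Hypothetical base URL; the real entity-api host is not given in this thread.
ENTITY_API = "https://entity-api.example.org"


def build_status_put(uuid, token, status="submitted"):
    """Build the PUT /entities/<uuid> request with the JSON body shown above."""
    return urllib.request.Request(
        url=f"{ENTITY_API}/entities/{uuid}",
        data=json.dumps({"status": status}).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def build_entity_get(uuid, token):
    """Build the later GET /entities/<uuid> request."""
    return urllib.request.Request(
        url=f"{ENTITY_API}/entities/{uuid}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def submit_all(uuids, token, send=urllib.request.urlopen):
    """Step 1: one PUT per dataset UUID of the upload."""
    for uuid in uuids:
        with send(build_status_put(uuid, token)) as resp:
            resp.read()


def read_statuses(uuids, token, send=urllib.request.urlopen):
    """Later step: one GET per UUID, collecting the reported status."""
    statuses = {}
    for uuid in uuids:
        with send(build_entity_get(uuid, token)) as resp:
            statuses[uuid] = json.load(resp).get("status")
    return statuses
```

If a stale status shows up in `read_statuses` for a UUID that `submit_all` already updated, that would point at the GET path rather than the PUT loop.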
2/16/2024: tested on PROD using an existing IEC Testing dataset d0ebafcbe06c9fa2cb739c4a860e79bd with "New" status, changing it to "Submitted" and back to "New". Could NOT reproduce; each status update deleted the old cache.
[2024-02-16 05:07:04] INFO in schema_manager: Deleted cache by key: hm_entity_prod__neo4j_d0ebafcbe06c9fa2cb739c4a860e79bd, hm_entity_prod__complete_d0ebafcbe06c9fa2cb739c4a860e79bd
A GET call made right after this PUT call showed the updated status "Submitted".
[2024-02-16 05:07:34] INFO in schema_manager: Deleted cache by key: hm_entity_prod__neo4j_d0ebafcbe06c9fa2cb739c4a860e79bd, hm_entity_prod__complete_d0ebafcbe06c9fa2cb739c4a860e79bd
Another GET call made right after this PUT call also showed the updated status, "New":
"status": "New",
"status_history": [
    {
        "change_timestamp": 1708060024775,
        "changed_by_email": "hubmap@hubmapconsortium.org",
        "status": "Submitted"
    },
    {
        "change_timestamp": 1708060054425,
        "changed_by_email": "hubmap@hubmapconsortium.org",
        "status": "New"
    }
]
@sunset666 the response of the PUT call looks like below:
{
    "message": "Dataset of d0ebafcbe06c9fa2cb739c4a860e79bd has been updated"
}
Can you add this response to the pipeline logging? Also, please log when the next GET call is made, in case the pipeline doesn't actually make a new GET call after the PUT call, or the ds rslt output in the log comes from a result obtained before the PUT call. This would help us debug further and find the root cause.
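One way the requested logging could look is sketched below. It is only a suggestion: the logger name, the `log_call` and `get_followed_put` helpers, and the record fields are all hypothetical. The timestamp is recorded in milliseconds so it can be compared directly with `change_timestamp` in `status_history`.

```python
import json
import logging
import time

# Hypothetical logger name; the pipeline's real logger is not shown in this thread.
logger = logging.getLogger("pipeline.entity_api")


def log_call(method, uuid, http_status, body, now_ms=None):
    """Log one entity-api call and return the record that was logged.

    now_ms can be passed in for testing; by default the current
    wall-clock time in milliseconds is used.
    """
    record = {
        "method": method,
        "uuid": uuid,
        "http_status": http_status,
        "called_at_ms": int(time.time() * 1000) if now_ms is None else now_ms,
        "body": body,
    }
    logger.info("entity-api call: %s", json.dumps(record, sort_keys=True))
    return record


def get_followed_put(put_record, get_record):
    """True if the GET targets the same UUID and was issued at or after the PUT."""
    return (
        get_record["uuid"] == put_record["uuid"]
        and get_record["called_at_ms"] >= put_record["called_at_ms"]
    )
```

With both records in the log, a GET whose `called_at_ms` is earlier than the PUT's would indicate that the logged result predates the update.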
@yuanzhou, we are adding a "Cache-Control: no-cache" header to the request, just to test whether that fixes this issue.
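A minimal sketch of such a request, assuming a hypothetical base URL and bearer-token auth (neither is specified in this thread). On a request, `no-cache` asks HTTP caches along the way to revalidate with the origin before reusing a stored response; whether it has any effect on a cache inside the service itself is not established here.

```python
import urllib.request

# Hypothetical base URL; the real entity-api host is not given in this thread.
ENTITY_API = "https://entity-api.example.org"


def build_uncached_get(uuid, token):
    """Build GET /entities/<uuid> with a Cache-Control: no-cache request header."""
    return urllib.request.Request(
        url=f"{ENTITY_API}/entities/{uuid}",
        method="GET",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Cache-Control": "no-cache",
        },
    )
```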
Regarding the "Cache-Control: no-cache" header on the request: not sure that'll make a difference, but it is definitely worth adding more logging. The entity-api has an internal caching mechanism, but it removes the old entity cache after each PUT update.
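To illustrate the delete-on-update behavior, here is a toy in-memory stand-in. It is NOT the actual schema_manager code: the `EntityCache` class and its methods are hypothetical, and only the key format is taken from the "Deleted cache by key" log lines earlier in this thread.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("schema_manager_sketch")


class EntityCache:
    """Toy cache keyed like the entries seen in the PROD log lines."""

    def __init__(self, prefix="hm_entity_prod_"):
        self._prefix = prefix
        self._store = {}

    def keys_for(self, uuid):
        # e.g. hm_entity_prod__neo4j_<uuid> and hm_entity_prod__complete_<uuid>
        return [f"{self._prefix}_neo4j_{uuid}", f"{self._prefix}_complete_{uuid}"]

    def get(self, uuid, load):
        """Return the cached entity, calling load(uuid) only on a cache miss."""
        key = self.keys_for(uuid)[1]
        if key not in self._store:
            self._store[key] = load(uuid)
        return self._store[key]

    def invalidate(self, uuid):
        """Drop both cache entries for this UUID, as happens after each PUT."""
        keys = self.keys_for(uuid)
        for key in keys:
            self._store.pop(key, None)
        logger.info("Deleted cache by key: %s", ", ".join(keys))
        return keys
```

In this model a GET can only return a stale status if `invalidate` is skipped, or if the read that repopulates the cache happens before the update is visible in the backing store.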
3/4/2024 from Sunset:
Still no data to test those updates.
Waiting for a way to replicate from @sunset666
Reported by Sunset on 2/5/2024 on PROD: