ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

GSE171668, GSE163530 A single-cell and spatial atlas of autopsy tissues reveals pathology and cellular targets of SARS-CoV-2 #317

Closed rays22 closed 1 year ago

rays22 commented 3 years ago

Google Sheet:

https://docs.google.com/spreadsheets/d/1DKVGFAwP3bcA4Atk9kW2LQe4rKahWW4B/edit#gid=1189631159

Project Submission details:

GSE171668_SingleCellAndSpatialAtlasCovid Project UUID: 61515820-5bb8-45d0-8d12-f0850222ecf0 Submission UUID: 9c4eea83-8939-4232-9568-d31eea06718d API: 626050ad6357205eb7bbc1e5

Primary Wrangler:

Ami

Secondary Wrangler:

Wei

Published study links

Key Events

gabsie commented 2 years ago

@MightyAx to just write an improvement ticket for graph validation - to check the process is linked not only to a sequence file but to an image file as well
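
As a rough illustration of the rule being requested, the check could be sketched as below. The validator's own tests are queries run against neo4j, so this is only an in-memory stand-in. `FileNode`, `files_without_process` and the `(process_uuid, file_uuid)` link format are hypothetical names invented for the sketch, not part of ingest-graph-validator.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for a file node in the metadata graph.
@dataclass(frozen=True)
class FileNode:
    uuid: str
    file_type: str  # e.g. "sequence_file" or "image_file"


def files_without_process(files, process_file_links,
                          checked_types=("sequence_file", "image_file")):
    """Return files of the checked types that no process is linked to.

    process_file_links is an iterable of (process_uuid, file_uuid) pairs.
    """
    linked = {file_uuid for _process_uuid, file_uuid in process_file_links}
    return [f for f in files
            if f.file_type in checked_types and f.uuid not in linked]
```

Passing `checked_types=("sequence_file",)` would mimic a check that only looks at sequence files, which is the gap described above.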

MightyAx commented 2 years ago

@ami-day to fill in the input form to add this project to release 20. Also note that the ticket to improve graph validation, so that this issue is caught in future, is here: ebi-ait/ingest-graph-validator#71

ami-day commented 2 years ago

The project seems to be in "metadata valid" state https://contribute.data.humancellatlas.org/projects/detail?uuid=61515820-5bb8-45d0-8d12-f0850222ecf0&tab=upload as opposed to "exported". I have re-requested graph validation.

MightyAx commented 2 years ago

I imagine graph validation might fail:

MightyAx commented 2 years ago

Checking the logs the project failed graph validation 10am yesterday and took down ingest. I'll try running a manual graph validation on the ec2 now.

ESapenaVentura commented 2 years ago

@ami-day @MightyAx We need to re-export the project. I have noticed that there are double pipes in the computational_method field of the analysis protocols, so we'll need to fix that and re-export
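
A minimal sketch of how the affected protocols could be spotted before re-exporting, assuming each analysis protocol is available as a plain dict with an `id` and a `computational_method` string. The record layout, the function names and the `"; "` replacement are assumptions made for illustration; the right replacement is a wrangling decision.

```python
DOUBLE_PIPE = "||"


def find_double_pipes(protocols, field="computational_method"):
    """Return the ids of protocol records whose field contains '||'."""
    return [p["id"] for p in protocols
            if DOUBLE_PIPE in str(p.get(field, ""))]


def replace_double_pipes(value, replacement="; "):
    """Replace each '||' (and any surrounding whitespace) with the replacement.

    Empty segments, such as the one left by a trailing '||', are dropped.
    """
    parts = [part.strip() for part in value.split(DOUBLE_PIPE)]
    return replacement.join(part for part in parts if part)
```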

ami-day commented 2 years ago

Checking the logs the project failed graph validation 10am yesterday and took down ingest. I'll try running a manual graph validation on the ec2 now.

Thank you!

idazucchi commented 2 years ago

@MightyAx will set this to metadata valid so that Ami can remove the double pipe

MightyAx commented 2 years ago

I've set this back to metadata valid and restarted state tracking. Please do not progress to graph validation or exporting until the current exporting issues have been resolved.

gabsie commented 2 years ago

Whenever Ami does updates, she contacts Alexie for a manual graph validation.

ami-day commented 2 years ago

Updates done, and have messaged Alexie to let him know.

MightyAx commented 2 years ago

Using the same method as in ticket #757,

  1. I've emptied the neo4j database
  2. initialised graph validation environment
  3. begun hydration (populating the data).

Last time the hydration took 9 hours, so tomorrow we can check in and hopefully:

  1. start graph validation
  2. mark submission as graph valid

MightyAx commented 2 years ago

This project is graph valid:

(ingest-graph-validator) ubuntu@ip-172-31-71-222:~/ingest-graph-validator$ ingest-graph-validator hydrate ingest 9c4eea83-8939-4232-9568-d31eea06718d
22-08-08 11:00:45 [ingest_graph_validator.ingest_graph_validator] - INFO: starting neo4j docker instance
22-08-08 11:00:45 [ingest_graph_validator.ingest_graph_validator] - INFO: attached to backend container [neo4j-server]
22-08-08 11:00:45 [ingest_graph_validator.ingest_graph_validator] - INFO: Connecting to neo4j...
22-08-08 11:00:45 [ingest_graph_validator.hydrators.hydrator] - INFO: Started ingest hydrator for for submission [9c4eea83-8939-4232-9568-d31eea06718d]
22-08-08 11:00:45 [ingest_graph_validator.hydrators.hydrator] - INFO: Found project for submission 61515820-5bb8-45d0-8d12-f0850222ecf0
22-08-08 11:00:45 [ingest_graph_validator.hydrators.hydrator] - INFO: Found submission for project with uuid 9c4eea83-8939-4232-9568-d31eea06718d
22-08-08 12:32:08 [ingest_graph_validator.hydrators.hydrator] - INFO: imported 15737 nodes
22-08-08 12:32:08 [ingest_graph_validator.utils] - INFO: [get_nodes] took [581] ms
22-08-08 13:11:17 [ingest_graph_validator.hydrators.hydrator] - INFO: imported 113372 edges
22-08-08 13:11:17 [ingest_graph_validator.utils] - INFO: [get_edges] took [2349133] ms
22-08-08 20:01:22 [ingest_graph_validator.hydrators.hydrator] - INFO: hydration finished

(ingest-graph-validator) ubuntu@ip-172-31-71-222:~/ingest-graph-validator$ ingest-graph-validator action test graph_test_set
22-08-10 08:33:30 [ingest_graph_validator.ingest_graph_validator] - INFO: starting neo4j docker instance
22-08-10 08:33:30 [ingest_graph_validator.ingest_graph_validator] - INFO: attached to backend container [neo4j-server]
22-08-10 08:33:30 [ingest_graph_validator.ingest_graph_validator] - INFO: Connecting to neo4j...
22-08-10 08:33:30 [ingest_graph_validator.actions.test_action] - INFO: loading tests
22-08-10 08:33:30 [ingest_graph_validator.actions.test_action] - INFO: loaded [16] test queries
22-08-10 08:33:30 [ingest_graph_validator.actions.test_action] - INFO: running tests
22-08-10 08:33:32 [ingest_graph_validator.actions.test_action] - INFO: All tests finished
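
For reference, the hydration time can be read off the log timestamps above. This is a small sketch that assumes the leading timestamp is in `yy-mm-dd HH:MM:SS` form; `log_timestamp` and `elapsed` are helper names made up here.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%y-%m-%d %H:%M:%S"  # assumed format, e.g. "22-08-08 11:00:45"


def log_timestamp(line):
    """Parse the leading 17-character timestamp of a log line."""
    return datetime.strptime(line[:17], LOG_TIME_FORMAT)


def elapsed(first_line, last_line):
    """Return the time between two log lines as a timedelta."""
    return log_timestamp(last_line) - log_timestamp(first_line)


start = ("22-08-08 11:00:45 [ingest_graph_validator.ingest_graph_validator]"
         " - INFO: starting neo4j docker instance")
end = ("22-08-08 20:01:22 [ingest_graph_validator.hydrators.hydrator]"
       " - INFO: hydration finished")
print(elapsed(start, end))  # 9:00:37
```

That matches the roughly 9 hours seen on the previous run.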

MightyAx commented 2 years ago

Project Set to GraphValid:

curl -L -X PUT "https://api.ingest.archive.data.humancellatlas.org/submissionEnvelopes/626050ad6357205eb7bbc1e5/commitGraphValidEvent" -H "Authorization: Bearer <snip>"
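
The same request could be assembled in Python roughly as follows. This is a sketch using only the standard library. `build_commit_graph_valid_request` is a hypothetical helper name, the token is a placeholder, and the sketch does not try to reproduce curl's `-L` redirect handling.

```python
import urllib.request


def build_commit_graph_valid_request(api_base, envelope_id, token):
    """Build (but do not send) the PUT request that marks an envelope graph valid."""
    url = f"{api_base}/submissionEnvelopes/{envelope_id}/commitGraphValidEvent"
    return urllib.request.Request(
        url,
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )


request = build_commit_graph_valid_request(
    "https://api.ingest.archive.data.humancellatlas.org",
    "626050ad6357205eb7bbc1e5",
    "<token>",  # placeholder; a real bearer token is needed to send the request
)
# Sending is deliberately left out of the sketch:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before touching production ingest.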

MightyAx commented 2 years ago

I've exported the project, I can see that it is marked as exported in ingest, and new versions of the analysis_protocols have been uploaded to terra.

ofanobilbao commented 2 years ago

Yesterday I noticed that the submission was back to "Metadata valid", and that we had 2 extra duplicates of the project in Ingest: one with a "Metadata invalid" submission, and one that looked like the originally triaged copy, as all the information on the Project tab was filled out and it was the only one ticked to be displayed in the Catalogue. After a chat after stand-up today, the status is as follows:

ami-day commented 2 years ago

looks ok

ofanobilbao commented 1 year ago

This project was affected by double pipes and removed from the DCP Portal. It has now been fixed and exported again for R28. @ESapenaVentura will fill in the import form, as it was agreed with Mary Wons that it will go into R28.

ofanobilbao commented 1 year ago

Issues were flagged again on indexing of R28. The project was dropped. Fixing is being tracked in an Ops ticket: ebi-ait/hca-ebi-wrangler-central#1127

ofanobilbao commented 1 year ago

Enrique just confirmed he managed to export this dataset again for R29

ofanobilbao commented 1 year ago

Moving to close as Browser team did not flag any double pipes or metadata issues