ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

SCP1289 - Impaired local intrinsic immunity to SARS-CoV-2 infection in severe COVID-19 #1091

Closed idazucchi closed 1 year ago

idazucchi commented 1 year ago

Project short name:

Shalek-Human-SeqWellS3

Primary Wrangler: @arschat

Secondary Wrangler: Ida

Associated files

Published study links

Key Events

arschat commented 1 year ago

Will use 1 cell suspension for each specimen, following this Slack discussion.

arschat commented 1 year ago

After submitting, got the following error in ingest: Project is Invalid. Please go back and edit the project. * should NOT have additional properties at root of document

Reason: the bionetwork field in the spreadsheet was misspelled as project.hca_bionetwork instead of project.hca_bionetworks
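
For illustration, a minimal sketch of how a misspelled key ends up reported as an "additional property", and how it could be spotted before submitting. The `ALLOWED` set below is a hypothetical subset of project fields chosen for the example, not the real schema, and `unexpected_properties` is not part of ingest.

```python
import difflib


def unexpected_properties(document, allowed):
    """Map each key not in `allowed` to the closest allowed name (or None)."""
    suggestions = {}
    for key in document:
        if key not in allowed:
            close = difflib.get_close_matches(key, sorted(allowed), n=1)
            suggestions[key] = close[0] if close else None
    return suggestions


# Hypothetical subset of allowed project fields, for illustration only.
ALLOWED = {"project_core", "contributors", "hca_bionetworks"}

# Project content as it would look with the misspelled spreadsheet column.
content = {"project_core": {}, "hca_bionetwork": []}

print(unexpected_properties(content, ALLOWED))
```

Any key reported here would trip a schema that forbids additional properties, which matches the error seen above.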

anu-shiva commented 1 year ago

Edit the project fields through the API, and also edit the template.

arschat commented 1 year ago

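Deleted the wrong project fields through the API:

import requests

# headers_json is assumed to be defined earlier (request headers for the ingest API)
proj_url = 'https://api.ingest.archive.data.humancellatlas.org/projects/64267858c99baf0c6c9f23bd'
result = requests.get(proj_url).json()
# Drop the misspelled field from the project content
del result['content']['hca_bionetwork']
response = requests.post(proj_url, headers=headers_json, json=result)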

Note: this should have been sent with PATCH rather than POST.
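
A rough sketch of what the PATCH variant might look like. It assumes the project endpoint accepts a PATCH whose body carries the full corrected `content` object; that behaviour is not confirmed in this thread. `build_content_patch`, `patch_project` and the `session` parameter are hypothetical helpers introduced for the example.

```python
import copy


def build_content_patch(project, bad_key):
    """Return a PATCH body holding the project's content without `bad_key`."""
    content = copy.deepcopy(project["content"])
    content.pop(bad_key, None)  # no error if the key is already gone
    return {"content": content}


def patch_project(proj_url, headers, bad_key="hca_bionetwork", session=None):
    """Fetch the project, drop `bad_key` from its content and PATCH it back.

    Assumption: the API replaces `content` with the object sent in the body.
    """
    if session is None:
        import requests  # imported lazily so the helper above has no dependency
        session = requests.Session()
    project = session.get(proj_url, headers=headers).json()
    body = build_content_patch(project, bad_key)
    response = session.patch(proj_url, headers=headers, json=body)
    response.raise_for_status()
    return response
```

Sending only the `content` object, rather than the whole document returned by GET, avoids echoing back system-generated fields.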

Then resubmitted with the correct fields.

Still got an error, since "Lung Atlas" is not in the hca_tissue_atlas enum. Decided to skip the schema patch update and leave hca_tissue_atlas and hca_tissue_atlas_version unfilled, since the name of the atlas has not been decided yet.

arschat commented 1 year ago

New submission is now graph valid, and ready for secondary review.

idazucchi commented 1 year ago

Hi Arsenios! The dataset looks great, I only have two comments:

Specimen

Analysis protocol

Analysis file

arschat commented 1 year ago

CellxGene wrangling requirements

(genes are in gene symbol format in the count matrices)

arschat commented 1 year ago

Applied the secondary review suggestions (thanks Ida), and submitted. Input form to be filled when exported.

arschat commented 1 year ago

Input form sent

ESapenaVentura commented 1 year ago

Updated with hca tissue atlas and version

ESapenaVentura commented 1 year ago

Error in R28 - Neither the project nor the supplementary file metadata have submission_date established (Value: null). This may have been a side effect of programmatic updates.

Actions:

amnonkhen commented 1 year ago

I believe we have a gap in our validation process. Our metadata validation should have caught this error. Please create a bug ticket.

ESapenaVentura commented 1 year ago

yep - system fields are not validated because they are generated on the go
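
As a sketch of the kind of check that would have caught this, the function below flags metadata documents whose `provenance.submission_date` is absent or null. It is an illustration only, not an existing ingest validator; the document names in the example are made up.

```python
def missing_submission_date(documents):
    """Return names of documents whose provenance.submission_date is absent or null."""
    missing = []
    for name, doc in documents.items():
        provenance = doc.get("provenance") or {}
        if not provenance.get("submission_date"):
            missing.append(name)
    return missing


# Made-up documents for illustration.
docs = {
    "project.json": {"provenance": {"submission_date": None}},
    "specimen.json": {"provenance": {"submission_date": "2023-03-31T06:06:35Z"}},
}
print(missing_submission_date(docs))
```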

ESapenaVentura commented 1 year ago

Operational fix - Overwriting metadata in the staging area

# Project metadata
gsutil cat gs://broad-dsp-monster-hca-prod-ebi-storage/prod/111d272b-c25a-49ac-9b25-e062b70d66e0/metadata/project/111d272b-c25a-49ac-9b25-e062b70d66e0_2023-05-22T10:31:35.590000Z.json | jq -jc '.provenance.submission_date = "2023-03-31T06:06:350Z"' > 111d272b-c25a-49ac-9b25-e062b70d66e0_2023-05-22T10:31:35.590000Z.json
gsutil cp 111d272b-c25a-49ac-9b25-e062b70d66e0_2023-05-22T10:31:35.590000Z.json gs://broad-dsp-monster-hca-prod-ebi-storage/prod/111d272b-c25a-49ac-9b25-e062b70d66e0/metadata/project/111d272b-c25a-49ac-9b25-e062b70d66e0_2023-05-22T10:31:35.590000Z.json

# Supplementary file (Spreadsheet)

gsutil cat gs://broad-dsp-monster-hca-prod-ebi-storage/prod/111d272b-c25a-49ac-9b25-e062b70d66e0/metadata/supplementary_file/9ef0a3c2-9d3b-52d0-8ae8-3f968da4c390_2023-05-22T11:15:30.018000Z.json | jq -jc '.provenance.submission_date = "2023-03-31T06:06:350Z"' > 9ef0a3c2-9d3b-52d0-8ae8-3f968da4c390_2023-05-22T11:15:30.018000Z.json
gsutil cp 9ef0a3c2-9d3b-52d0-8ae8-3f968da4c390_2023-05-22T11:15:30.018000Z.json gs://broad-dsp-monster-hca-prod-ebi-storage/prod/111d272b-c25a-49ac-9b25-e062b70d66e0/metadata/supplementary_file/9ef0a3c2-9d3b-52d0-8ae8-3f968da4c390_2023-05-22T11:15:30.018000Z.json
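
For reference, the jq step in the commands above could be sketched in Python roughly as follows. This covers only the JSON rewrite, not the gsutil download and upload, and `set_submission_date` is a name made up for the example. The timestamp is passed through verbatim with no format check.

```python
import json


def set_submission_date(raw_json, submission_date):
    """Set provenance.submission_date and return compact JSON, like `jq -c`."""
    doc = json.loads(raw_json)
    # Like the jq assignment, create the provenance object if it is missing or null.
    provenance = doc.get("provenance") or {}
    provenance["submission_date"] = submission_date
    doc["provenance"] = provenance
    return json.dumps(doc, separators=(",", ":"), ensure_ascii=False)
```

Calling it with the downloaded file's text and the same date string used above yields the body to write back to the staging area.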

ESapenaVentura commented 1 year ago

This fixes the issue for the strict timeline

ofanobilbao commented 1 year ago

@ESapenaVentura to confirm whether we need to make the changes in Ingest. Otherwise this ticket could be closed, right?

ofanobilbao commented 1 year ago

@ESapenaVentura, was this resolved as part of the remaining R28 errors that Amnon sorted out in Ingest last week? Trying to understand if we can close this.

idazucchi commented 1 year ago

The missing submissionDate was added back to ingest by Amnon - this ticket can be closed!