GSE109816 GSE121893 - HeartReconstructionPostHF

idazucchi commented 7 months ago

Project short name:

HeartReconstructionPostHF

Primary Wrangler:

Ida

Secondary Wrangler:

Associated files

Google Drive: folder

Published study links

Paper: Single-cell reconstruction of the adult human heart during heart failure and recovery reveals the cellular landscape underlying cardiac function
Accessioned data:
- GSE109816 healthy samples
- GSE121893 disease samples

Ingest

Key Events

[ ] Convert published metadata to HCA spreadsheet
[ ] Manually curate dataset to meet HCA metadata standard
[ ] Collect any matrix and cell-type annotation files
[ ] Are the analysis files suitable for CellxGene? If something is missing get in touch with the authors to request it
[ ] Upload sheet to validate metadata
[ ] Transfer raw files to ingest to validate data files
[ ] Check linking using ingest graph validator
[ ] Ask the Secondary Wrangler for an end-to-end review of the project. Ask the Expertise Wrangler to review specific tabs if needed
[ ] Submit dataset to Production
[ ] Complete the Export SOP
[ ] Convert project data to SCEA format following the SCEA conversion SOP if appropriate

idazucchi commented 7 months ago

Data

I tried to download the data but it failed, so I've asked the SRA help desk for help with the error. Enrique tried to download both accessions and failed. Wei tried to download just one accession and also failed.

I think this is due to an error in AWS. The only thing I can do is wait for SRA's reply.

cell suspension

There are too many cell suspensions (CS) for the analysis file input, so I need to make plate-based CS. However, some plate labels are shared between accessions.
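One way to keep plate-based cell suspensions distinct when a plate label appears in both accessions is to prefix the plate label with the accession. This is only a minimal sketch: the function names, the `(accession, plate_label, cell_id)` input shape, and the `accession_plate` ID format are assumptions for illustration, not the HCA metadata convention.

```python
from collections import defaultdict


def plate_suspension_id(accession: str, plate_label: str) -> str:
    """Build a plate-level cell suspension ID that is unique across accessions.

    Hypothetical ID format: '<accession>_<plate_label>'.
    """
    return f"{accession}_{plate_label}"


def group_cells_by_plate(cells):
    """Group cell IDs under their accession-qualified plate ID.

    `cells` is an iterable of (accession, plate_label, cell_id) tuples
    (an assumed input shape, e.g. parsed from the published metadata).
    """
    groups = defaultdict(list)
    for accession, plate_label, cell_id in cells:
        groups[plate_suspension_id(accession, plate_label)].append(cell_id)
    return dict(groups)
```

With this scheme, a plate label shared by GSE109816 and GSE121893 yields two separate plate-based cell suspensions instead of one merged entry.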

arschat commented 5 months ago

Ida tried to download the data but it failed, and she asked the SRA help desk for help with the error. Enrique tried to download both accessions and failed. Wei tried to download just one accession and failed.

Arsenios tried to download just one accession and failed.

We will try another strategy: downloading individual donors by searching for the donor name in the Run Selector search bar. Healthy donor N2 works; we will continue with the other donors and track the progress here.

Healthy individuals

[X] N2
[X] N6
[X] N5
[X] N1
[X] N3
[x] N10
[x] N11
[x] N9
[x] N4
[x] N12
[x] N8
[x] N7

Diseased individuals

[x] C1
[x] C2
[x] D1
[x] D2
[x] D4
[x] D5
[x] N13
[x] N14
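
The per-donor bookkeeping above could also be done from an exported Run Selector metadata table. The following is a minimal sketch, assuming the table is CSV text with a `Run` column and a donor column; the donor column name (`donor` here) and both helper names are hypothetical and would need to match the actual export.

```python
import csv
import io
from collections import defaultdict


def runs_by_donor(run_table_csv: str, donor_column: str = "donor") -> dict:
    """Group run accessions by donor from Run Selector metadata CSV text.

    `donor_column` is an assumed column name; adjust it to the real export.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(run_table_csv)):
        grouped[row[donor_column]].append(row["Run"])
    return dict(grouped)


def remaining_donors(all_donors, downloaded):
    """Return the donors that still need downloading, in their original order."""
    done = set(downloaded)
    return [donor for donor in all_donors if donor not in done]
```

For example, `remaining_donors(["N2", "C1", "D1"], ["N2"])` lists the donors still to be fetched, and `runs_by_donor(...)` gives the runs to request for each of them.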

arschat commented 4 months ago

All files have been downloaded to s3://hca-ncbi-cloud-data/. I created an hca-util area for HeartReconstructionPostHF: 854f5cac-7550-4369-8491-415bc8f74879.

- HeartReconstructionPostHF_SRR_accessions.txt
- HeartReconstructionPostHF_add_to_hca-util_area.txt
- HeartReconstructionPostHF_remove_from_cloud_delivery_area.txt

idazucchi commented 4 months ago

The submission is too large to upload to ingest. I've generated UUIDs for all the entities, and I will need Enrique's help to generate the submission.

idazucchi commented 4 months ago

Generating the submission with a script is not feasible (it takes 3+ days of monitoring the script), so this dataset is stalled until we can address the reason for the timeout (?) or otherwise make sure that ingest can process large submissions.
