ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Majlinda Lako - MRC scCornea. #96

Open ESapenaVentura opened 4 years ago

ESapenaVentura commented 4 years ago

Continued from https://github.com/HumanCellAtlas/hca-data-wrangling/issues/232

Primary Wrangler: Ami

Secondary Wrangler: Enrique

Associated Markdown Found here

Google sheet: https://docs.google.com/spreadsheets/d/11CLPU-rZrK7cH5YgepKJasm5UMNO32y8Zx2UuHehJc8/edit#gid=253173527

Latest Google Sheet Version: https://docs.google.com/spreadsheets/d/1Gi-HT8_D-TYIHN4C4kRUW7_HtMkcMRHLNBrsA4LQrGM/edit#gid=65395503

Publication: https://www.sciencedirect.com/science/article/pii/S1542012421000215?via%3Dihub

Key Events

Please track the below as well as the key events:

  1. Track the date the first spreadsheet was received and the date the final spreadsheet was sent by editing the ticket to include the date next to the event.
  2. Track spreadsheet iterations by placing asterisks next to the received-spreadsheet event.
  3. Track any metadata issues/tickets made for the dataset with a bulleted list of links under the received-spreadsheet event. Links should be to the ticket in the metadata repo.

24/06/2019: Sent e-mail to plan catch-up call

25/06/2019: Set up call for July 4th, 2019

24/06/2020: Sent spreadsheet and instructions for filling in the spreadsheet and starting data upload

ESapenaVentura commented 4 years ago

Received an email on July 22. Due to time constraints, they are going to submit to SRA, but they will come back to us before the end of the year to submit to DCP.

ESapenaVentura commented 4 years ago

BioRxiv manuscript here: https://www.biorxiv.org/content/10.1101/2020.07.09.195438v1

ESapenaVentura commented 4 years ago

There is not enough metadata to begin wrangling this dataset. Once the manuscript is ready or they come back to us, I will begin working on this. For now, I'll put it in the icebox.

rays22 commented 3 years ago

There is a recently published HCA paper that looks related to the pre-print above: Collin J, Queen R, Zerti D, et al. A single cell atlas of human cornea that defines its development, limbal progenitor cells and their interactions with the immune cells. The Ocular Surface. 2021 Apr. DOI: 10.1016/j.jtos.2021.03.010 PMID: 33865984

rays22 commented 3 years ago

Data availability: GSE155683

ESapenaVentura commented 3 years ago

This dataset contains ATAC-Seq, which we can't push to the DCP until we get the Read3 enum value.

ami-day commented 2 years ago

Wrangling this dataset.

ESapenaVentura commented 2 years ago

I have reviewed the spreadsheet! Here are my comments:

Project

Project - Contributors

Donor organism

Specimen from organism

Cell line

Sequence file

Analysis protocol

Analysis files

Please address these comments and let me know when the spreadsheet is ready for another review :)

ami-day commented 2 years ago

To do: add new ontology term for "dysplasia". Keep "cornea neoplasm" for now and make update when the ontology term is available.

ami-day commented 2 years ago

Requested the fastq files from NCBI cloud delivery service. Only 1 fastq file is available via ENA. Hopefully NCBI will provide the paired-end reads.

ami-day commented 2 years ago

Thanks @ESapenaVentura, I have made all your suggested changes except:

Here is the updated version. Could you re-review it to double-check that all is OK? Thanks! https://docs.google.com/spreadsheets/d/1Gi-HT8_D-TYIHN4C4kRUW7_HtMkcMRHLNBrsA4LQrGM/edit#gid=65395503

ami-day commented 2 years ago

Next ontology term request: https://github.com/HumanCellAtlas/ontology/issues/100

ami-day commented 2 years ago

upload area: 75b63f7f-1eb8-473c-b751-8b9376563c22

ami-day commented 2 years ago

Submitted.

ami-day commented 2 years ago

Assigned E-HCAD id: E-HCAD-46

Pre-converted the files, which are stored here: https://drive.google.com/drive/folders/1MfRn7erOuzDpiASZfSYUVZvgw3uqWyfd

ESapenaVentura commented 2 years ago

Same comment as here: https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/3#issuecomment-1058135936

ami-day commented 2 years ago

Re-assigning this dataset to E-HCAD-50.

ami-day commented 2 years ago

Manually curated the idf and sdrf files and uploaded them to Gitlab: https://gitlab.ebi.ac.uk/ebi-gene-expression/scxa-metadata/-/merge_requests/294

E-HCAD-50

ami-day commented 2 years ago

This dataset needs updating. The library prep method should be 10X 3' v3, not Drop-seq. I edited it in ingest, and now it is raising an error because no spatial barcode info is added. However, this is not a 10X Visium dataset: https://contribute.data.humancellatlas.org/submissions/detail?uuid=50d64edf-6fbe-48ea-89aa-80f3543dd077&project=6ac8e777-f9a0-4288-b5b0-446e8eba3078

ami-day commented 2 years ago

Submitted update.

ami-day commented 2 years ago

Submitted updates.

ami-day commented 2 years ago

Exported.

ami-day commented 2 years ago

Looks as expected in the HCA Data Portal.