ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Single Cell RNA-seq reveals ectopic and aberrant lung resident cell populations in Idiopathic Pulmonary Fibrosis #750

Open ipediez opened 2 years ago

ipediez commented 2 years ago

Project short name:

lungPopulationsIPF

Primary Wrangler:

All wranglers. Metadata lead: Irene. Data lead: Enrique.

Secondary Wrangler:

Associated files

Published study links

Key Events

ipediez commented 2 years ago

We started wrangling this dataset as part of the "Wrangler workshop - remove obstacles in the flow".

ESapenaVentura commented 2 years ago

hca-util area: eaf44850-d35f-4ab0-a53e-f81c5f236289

ipediez commented 2 years ago

Fields waiting on secondary review:

gabsie commented 2 years ago

@Wkt8 to secondary review this.

Wkt8 commented 2 years ago

Looks great to me! Fixed a small typo with contributor names in Ingest. Apart from that, it looks great!

MightyAx commented 2 years ago

R16 ReExport Successful

ESapenaVentura commented 2 years ago

@idazucchi to look today @ESapenaVentura: LGTM

arschat commented 1 week ago

After Dave's request in #1270, I am re-opening this ticket to add the fastq files that are now available in GSE136831.

The hca-util upload area already contains the analysis files, and the fastqs have now been uploaded there as well. I am discussing with @ESapenaVentura how to proceed with the update while keeping the same uuids for all other entities.

I downloaded the spreadsheet from the Ingest submission, added the Sequence file tab, and wrangled the fastq files into it. I also filled in the uuids of the entities that the sequence files reference (cell suspension, library preparation protocol, sequencing protocol).

The submission had errors, so we decided to remove it and populate a new submission with the existing uuids. Since the submission was not being deleted, I removed all of its entities with the following script, and Enrique added the new submission with the same uuids.
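The uuid-filling step can be sketched in plain Python. This is only an illustration, not the procedure actually used: the column names (`file_name`, `cell_suspension_id`, `cell_suspension_uuid`), the placeholder uuid values and the helper `fill_linked_uuids` are all hypothetical, and the rows stand in for the Sequence file tab.

```python
def fill_linked_uuids(seq_rows, uuid_by_id, link_columns):
    """Copy existing uuids into sequence file rows.

    seq_rows:     list of dicts, one per sequence file row (hypothetical layout)
    uuid_by_id:   maps an entity ID from the old submission to its uuid
    link_columns: maps each ID column to the uuid column that should be filled
    Returns the filled rows plus a list of links that could not be resolved.
    """
    filled, missing = [], []
    for row in seq_rows:
        new_row = dict(row)  # leave the input rows untouched
        for id_col, uuid_col in link_columns.items():
            entity_id = row.get(id_col)
            if entity_id in uuid_by_id:
                new_row[uuid_col] = uuid_by_id[entity_id]
            else:
                # Report instead of guessing, so no entity gets a new uuid by accident
                missing.append((row.get("file_name"), id_col, entity_id))
        filled.append(new_row)
    return filled, missing


# Hypothetical example data
rows = [{"file_name": "sample1_R1.fastq.gz", "cell_suspension_id": "cs_1"}]
known = {"cs_1": "placeholder-uuid-for-cs_1"}
columns = {"cell_suspension_id": "cell_suspension_uuid"}

filled_rows, unresolved = fill_linked_uuids(rows, known, columns)
print(filled_rows)
print(unresolved)
```

Collecting unresolved links separately makes it easy to spot a typo in an ID before the spreadsheet is uploaded.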

delete_all_entities.py

```python
from hca_ingest.api.ingestapi import IngestApi
import requests

token = ''  # paste a valid Ingest API token here
submission_url = 'https://api.ingest.archive.data.humancellatlas.org/submissionEnvelopes/6257fd5c0d00514fc697a047/'

api = IngestApi(url="https://api.ingest.archive.data.humancellatlas.org/")
api.set_token(f"Bearer {token}")

# Collect the self link of every entity in the submission
all_href = []
entities = ['biomaterials', 'processes', 'files', 'protocols']
for entity in entities:
    ent_url = submission_url + entity
    all_ent = api.get_all(ent_url, entity)
    all_href.extend([ent['_links']['self']['href'] for ent in all_ent])

payload = {}
headers = {
    'Authorization': f'Bearer {token}'
}

# Delete each entity and report any request that fails
for url in all_href:
    response = requests.request("DELETE", url, headers=headers, data=payload)
    if not response.ok:
        print(f'Error in url:{url}')
```
adding_genome_version.py

```python
from hca_ingest.api.ingestapi import IngestApi

token = ''  # paste a valid Ingest API token here (was undefined in the original snippet)

api = IngestApi('https://api.ingest.archive.data.humancellatlas.org/')
headers_json = {'Content-Type': 'application/json',
                'Authorization': 'Bearer ' + token}

# Ontology term that marks a file as a count matrix
count_ontology = {
    "text": "count matrix",
    "ontology": "data:3917",
    "ontology_label": "Count matrix"
}

files = api.get_all('https://api.ingest.archive.data.humancellatlas.org/submissionEnvelopes/667e9436a28a8668d47199bf/files', 'files')
for file in files:
    # Only analysis files carry genome_assembly_version
    if file['content']['describedBy'] != 'https://schema.humancellatlas.org/type/file/7.0.0/analysis_file':
        continue
    if count_ontology in file['content']['file_core']['content_description']:
        file['content']['genome_assembly_version'] = 'GRCh38'
    else:
        file['content']['genome_assembly_version'] = 'Not Applicable'
    file_href = file['_links']['self']['href']
    response = api.patch(file_href, headers=headers_json, json={'content': file['content']})
    if not response.ok:
        print(f'error in {file_href} patching')
```
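The rule that adding_genome_version.py applies can be checked offline by pulling it into a pure function. This is a sketch, not part of the original script: the helper name `genome_assembly_for` is hypothetical, and it uses `.get` where the script indexes directly, so it tolerates missing keys that the script would raise on.

```python
ANALYSIS_FILE_SCHEMA = "https://schema.humancellatlas.org/type/file/7.0.0/analysis_file"

COUNT_MATRIX = {
    "text": "count matrix",
    "ontology": "data:3917",
    "ontology_label": "Count matrix",
}


def genome_assembly_for(content):
    """Return the genome_assembly_version the script would set.

    Returns None for files the script skips (anything that is not a
    7.0.0 analysis file).
    """
    if content.get("describedBy") != ANALYSIS_FILE_SCHEMA:
        return None
    descriptions = content.get("file_core", {}).get("content_description", [])
    # Count matrices get the assembly; every other analysis file is marked not applicable
    return "GRCh38" if COUNT_MATRIX in descriptions else "Not Applicable"


# Made-up file content for illustration only
example = {
    "describedBy": ANALYSIS_FILE_SCHEMA,
    "file_core": {"content_description": [COUNT_MATRIX]},
}
print(genome_assembly_for(example))
```

Running the function over the downloaded file metadata before patching gives a dry run of which files would receive which value.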

Spreadsheet with sequence files: 1da128e6-c525-42c0-be81-384998f93781_20240620-155943.xlsx

spreadsheet_updated_with_new_uuids.xlsx

arschat commented 4 days ago

Graph valid & exported. Import form sent