ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Subsets of ILC3−ILC1-like cells generate a diversity spectrum of innate lymphoid cells in human mucosal tissues #624

Closed ami-day closed 1 year ago

ami-day commented 2 years ago

Short name: InnateLymphoidCells

Ingest Publication Subsets of ILC3−ILC1-like cells generate a diversity spectrum of innate lymphoid cells in human mucosal tissues https://www.nature.com/articles/s41590-019-0425-y

Google Sheet

with matrix: https://docs.google.com/spreadsheets/d/11UWdLOj68AJ_akJLN8v3364_TqNLY91BmF8u__5Ndrc/edit#gid=177794954

no matrix: https://docs.google.com/spreadsheets/d/1AOBAETE6XkG8XgHA5M582qPZnSHXroDQLoJxgJ4d4gU/edit#gid=177794954

From the authors

  • The chemistry was 3’ v2.
  • Single-cell fastqs were sent to us via their private upload area.
  • They were unable to provide some donor metadata; they provided us with the disease status and development stage.
  • They were unable to provide a gene expression matrix and cell annotations file.

ami-day commented 2 years ago

Emailed the authors to ask about access to both raw fastq and gene expression matrix data. Also asked about the 10X technology version.

ami-day commented 2 years ago

The authors got back: they will send us the raw fastq and matrices. I will set up an AWS transfer.

ami-day commented 2 years ago

Moving this to stalled. The donor metadata is not available in a public database or in the publication supplements. Need to ask the authors about this.

ami-day commented 2 years ago

Natalia has transferred the data to the AWS upload area we sent.

ami-day commented 2 years ago

upload area: 9269fde6-21ed-4c18-983d-e9d5b5dec37f

ami-day commented 2 years ago

Moving to stalled: waiting to hear back from authors about some donor metadata.

ofanobilbao commented 2 years ago

@ipediez will secondary review this as soon as she finishes her current dataset

Wkt8 commented 2 years ago

Irene will complete secondary review today

ipediez commented 2 years ago

I've noticed you entered some of the ontologies but not others. I've pointed out those that were not filled, just so you don't forget them

Project

Donor organism

Collection protocol

Specimen from organism

Cell suspension

Dissociation protocol

Enrichment protocol

Library preparation protocol

Sequencing protocol

Sequence files

ami-day commented 2 years ago

Thanks @ipediez I have made these changes. It is ready to validate in ingest now. Only potential problem is that, after removing the scRNA-seq fastq files, there is no scRNA-seq matrix, only bulk. I need to ask Natalia again if she has the single-cell matrices, and I'll also check the upload area where she sent me the data to double check.

ami-day commented 2 years ago

waiting to hear back from Tony about this (GDPR requirements)

ami-day commented 2 years ago

Going to submit this without the matrix file.

ami-day commented 2 years ago

Created a copy without a matrix file.

ami-day commented 2 years ago

downloading sra objects from json file (wget)

ami-day commented 2 years ago

baee61c0-c020-4ffa-adae-9df4c1f6fc75

ami-day commented 2 years ago

Uploaded to ingest, need to sync the data files when I have them ready

ami-day commented 2 years ago

a00bb9e1-f062-4e55-b7c2-23e60f5d55a9

ami-day commented 2 years ago

synced fastq

ami-day commented 2 years ago

graph validating

ami-day commented 2 years ago

submitted.

ami-day commented 2 years ago

exported to DCP.

ami-day commented 2 years ago

We now have information that we can upload a matrix requested from the authors for living EU donor datasets. So I will update the project with the expression matrix.

ami-day commented 2 years ago

I added the analysis protocol and analysis file tabs, with the relevant expression matrix, then imported the updated spreadsheet to the project in ingest. I encountered an error: https://contribute.data.humancellatlas.org/submissions/detail?uuid=522ee883-e3da-412f-a463-0112d2f2f63f&project=f4d011ce-d1f5-48a4-ab61-ae14176e3a6e

ami-day commented 2 years ago

We need to be able to create a new analysis file in order to upload the gene expression matrix file. I was able to add a new analysis protocol. I will create a ticket in development - new issues.

Wkt8 commented 2 years ago

Alternatively, would this work by creating a new submission?

Wkt8 commented 2 years ago

As this is already in metadata valid, we would not be able to create a new submission for it. Ami is willing to wait until the ticket in development to add a new file is prioritised.

MightyAx commented 2 years ago

I think you should be able to just add the file to the upload area, provided that the upload area still exists and the submission is in a pre-exporting state (Draft, Metadata Valid).

Otherwise, a new submission to the existing project would be the "old-normal" way of handling this.
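
The rule of thumb above can be sketched as a small decision helper. This is only an illustration of the reasoning in this comment, not ingest code: the function name, the returned strings, and the idea of passing the state as a plain string are all assumptions; only the two pre-exporting states (Draft, Metadata Valid) come from the comment itself.

```python
# Hypothetical helper: names and return values are illustrative, not part of ingest.
PRE_EXPORTING_STATES = {"Draft", "Metadata Valid"}  # states named in the comment above


def how_to_add_file(submission_state: str, upload_area_exists: bool) -> str:
    """Pick the route for adding a data file to an already-wrangled project."""
    if upload_area_exists and submission_state in PRE_EXPORTING_STATES:
        # The submission has not been exported yet, so the file can simply
        # be dropped into the upload area it already uses.
        return "add file to existing upload area"
    # Either the upload area is gone or the submission is past exporting:
    # fall back to the "old-normal" route.
    return "create new submission in existing project"


print(how_to_add_file("Metadata Valid", upload_area_exists=True))
print(how_to_add_file("Exported", upload_area_exists=True))
```

Note that this only covers the data file itself; it says nothing about adding the metadata entity that describes the file.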

ami-day commented 2 years ago

@MightyAx the issue is with adding an Analysis file as a metadata entity to an existing submission, and then, like you said, we could just add the file to the upload area. I think the best approach would be to remove the changes/updates made (i.e. analysis protocol) and set the state back to exported (without exporting any changes). Then, create a new submission with the Analysis file added and submit that.

ami-day commented 2 years ago

Unfortunately, I am not able to delete the Analysis protocol that was added recently: https://contribute.data.humancellatlas.org/submissions/detail?uuid=522ee883-e3da-412f-a463-0112d2f2f63f&project=f4d011ce-d1f5-48a4-ab61-ae14176e3a6e "It was not possible to delete Protocol: 5ff502a9-77c5-404a-80d9-740ccb870809. Unexpected server error"

ofanobilbao commented 2 years ago

Also, this project is missing you as a curator, @ami-day. You need to update it.

ESapenaVentura commented 2 years ago

@ami-day to send @ESapenaVentura the spreadsheet used for update

ofanobilbao commented 2 years ago

Ami is showing up as part of the contributor list

ami-day commented 2 years ago

updated my role as data curator

ESapenaVentura commented 2 years ago

There are 4 entities to be added:

And 2 options to add them:

  1. Create a new submission:

    • Create a new submission with only analysis file, analysis protocol
    • Create a new process via the UI in the new submission
    • Link the old cell suspensions, old library and sequencing protocol, and new analysis file and analysis protocol manually through the UI
  2. In the old submission:

    • Upload the new analysis file to the hca-util upload area used previously
    • Modify the newly-created draft entity to match with the analysis file metadata (MAY need a dev to set up schema, not sure)
    • Add the new analysis_protocol and a new process and link everything together (library prep, sequencing prep, analysis_protocol, analysis_file)

The first option will also require a dev, as currently the status of the submission is metadata_valid

idazucchi commented 2 years ago

Option 2 doesn't require manual linking! You can upload a spreadsheet using the uuids of existing biomaterials and protocols to link them to the new files, so this update should be fairly easy to do

ofanobilbao commented 1 year ago

@ami-day this needs addressing and update pushed. No blockers

ami-day commented 1 year ago

Working on Option 2. Have synced the new analysis file to the s3 bucket. Trying to modify the newly-created draft entity to match the analysis file metadata, but the UI gets stuck when I click "Edit" to edit the new Draft file. Have tried multiple times and hit the same problem each time.

ESapenaVentura commented 1 year ago

The UI gets stuck because the form the UI displays is based on the schema loaded via the content.describedBy field

The "draft" file does not have any content (As you can see if you inspect the metadata in the API), so it will need a dev
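
A minimal sketch of the check described above, assuming the file entity has already been fetched from the API as a JSON object and parsed into a Python dict. Only the `content.describedBy` path comes from this comment; the function name, the example dicts, the `fileName` key, and the example schema URL are hypothetical.

```python
from typing import Any, Optional


def schema_url(entity: dict[str, Any]) -> Optional[str]:
    """Return the schema URL an edit form could be built from, or None."""
    # A "draft" file may have no content at all (missing key or null).
    content = entity.get("content") or {}
    url = content.get("describedBy")
    return url if isinstance(url, str) and url else None


# Hypothetical examples: a draft file with empty content, and a populated one.
draft_file = {"fileName": "matrix-file", "content": None}
populated_file = {"content": {"describedBy": "https://example.org/schema/analysis_file"}}

for entity in (draft_file, populated_file):
    url = schema_url(entity)
    print("editable in UI" if url else "no schema to build a form from", url)
```

When the function returns None there is nothing for the form to load, which matches the stuck "Edit" behaviour reported here.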

ami-day commented 1 year ago

> The UI gets stuck because the form the UI displays is based on the schema loaded via the content.describedBy field
>
> The "draft" file does not have any content (As you can see if you inspect the metadata in the API), so it will need a dev

Will add this info to the ticket I just created in Ops. Thanks!

ami-day commented 1 year ago

Ticket: https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/966

ofanobilbao commented 1 year ago

In order to update this project, #966 needs to be resolved.

ami-day commented 1 year ago

Have submitted the updated version with analysis file.

ami-day commented 1 year ago

Exported. Submitted import form.

ofanobilbao commented 1 year ago

In Ami's absence, this looks good to me in the browser, so closing it