ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

GSE116222: EpithelialDiversityHealthInflammation #220

Closed ami-day closed 2 years ago

ami-day commented 3 years ago

Wrangler responsible for this dataset/lab:

Primary: Ami

Secondary: Enrique

Description of the task:

GSE116222: EpithelialDiversityHealthInflammation

https://docs.google.com/spreadsheets/d/1vDUU36zcg5h3eUQMqO3GajI8VEKDvuZ0/edit#gid=851060064

Acceptance criteria for the task:

ESapenaVentura commented 3 years ago

I'll add myself as a secondary wrangler as this dataset is under the MRC grant!

lauraclarke commented 3 years ago

@ami-day please make sure you charge all the time you spent wrangling this dataset to the MRC cost centre. If you don't have access to it (42687), please say and we can sort it out.

ESapenaVentura commented 3 years ago

I have reviewed the dataset and overall it looks great! There are just a couple of things I've changed and a bit I would like to discuss:

Changes

Project

Enrichment protocol

Discuss

Specimen

Cell suspension

I have saved the new spreadsheet in the folder, with _ES_review added to the filename.

lauraclarke commented 3 years ago

@ami-day would it be possible for you to charge your work on this dataset to the MRC award in your timesheets?

ami-day commented 3 years ago

Thanks @ESapenaVentura for this review, I'll have a look at this tomorrow and get back to you

rays22 commented 3 years ago

This dataset has been exported to the Terra staging area.

ami-day commented 3 years ago

Converting to SCEA.

lauraclarke commented 3 years ago

@ami-day will the SCEA submission for this project also need to be fixed?

ami-day commented 3 years ago

I'm not sure why it was moved to Finished before. I haven't finished converting it; I think it still needs validation and then review by the SCEA team. I will also make the necessary changes to the idf and sdrf files.

ami-day commented 3 years ago

Submitted the metadata update in ingest, so I'm moving this ticket to the scea brokering column.

lauraclarke commented 3 years ago

Did you confirm it was successfully exported and did you complete the export SOP for it?

lauraclarke commented 3 years ago

https://app.zenhub.com/workspaces/operations-5fa2d8f2df78bb000f7fb2b5/issues/ebi-ait/hca-ebi-wrangler-central/280 It looks like some export validation is still needed to confirm this is okay before completing the export SOP.

ami-day commented 3 years ago

Status in ingest prod. is 'valid' not 'exported'. I think @aaclan-ebi and @MightyAx are still working on export of this since the updates to the metadata were made?

lauraclarke commented 3 years ago

@aaclan-ebi I think this is dependent on the work you and Alexie are doing to re-arrange existing files in the staging area in order to enable safe updates.

Do you have a timeframe for this work being finished?

aaclan-ebi commented 3 years ago

I just finished the script to fix this. I'm just waiting for 1 more PR approval, then I plan to run the script, and this should be resolved by EOD. Apologies for the delay.

aaclan-ebi commented 3 years ago

Btw, Alexie has worked on a different fix which I believe wasn't blocking this ticket. The fix I was working on has now been applied. I asked @ami-day to confirm her updates and file the import request form to the Data Import team.

ami-day commented 3 years ago

@aaclan-ebi this project is in the 'valid' state currently, not exported. Is that how it should be? I guess I need to submit the metadata changes?

ami-day commented 3 years ago

Oh, actually there are 2 projects, and 1 has been exported: https://contribute.data.humancellatlas.org/projects/detail?uuid=c893cb57-5c9f-4f26-9312-21b85be84313

lauraclarke commented 3 years ago

@ami-day are there two versions of this project in the ingest system? Do you know which one to delete?

Does the one which is "correct" still need the fixes described in https://app.zenhub.com/workspaces/operations-5fa2d8f2df78bb000f7fb2b5/issues/ebi-ait/hca-ebi-wrangler-central/280 ?

ami-day commented 3 years ago

I deleted the project which was not the correct one (in the 'valid' state). The correct one was in the 'exported' state and is the one I updated and @aaclan-ebi then fixed. I have submitted the import form for this dataset with a note that the metadata has been updated.

ami-day commented 3 years ago

Already converted to SCEA format with E-HCAD-27. It was handed over but is in review and might be for a while because it includes full paths to bam files (no fastq paths are available).

ami-day commented 3 years ago

This dataset has been exported, but it is missing some DNA content metadata for multiple records. This needs to be updated and re-exported. However, it requires the ability to make bulk updates, so I am moving it to the needs update column for now.

rays22 commented 3 years ago

https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/220

lauraclarke commented 3 years ago

@rays22 you seem to have linked the ticket to itself, is there an ops board ticket you meant to link to?

rays22 commented 3 years ago

Sorry about that. I think the relevant ops board ticket is this: https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/280#issuecomment-832162128

ami-day commented 3 years ago

The original errors were corrected, but the dataset is still missing the file content text and ontology fields (value: DNA sequence) in multiple rows (https://data.humancellatlas.org/explore/projects/c893cb57-5c9f-4f26-9312-21b85be84313/project-metadata).

ami-day commented 3 years ago

This update is blocked by our inability to make bulk updates.

ofanobilbao commented 2 years ago

@ami-day does the SCEA submission that you originally did need an update? Or can this ticket be closed?

ami-day commented 2 years ago

This can be closed; the SCEA files do not need to be updated.

ESapenaVentura commented 2 years ago

Re-opening to investigate this https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/554

Wkt8 commented 2 years ago

Does not need any further changes. The data files are technically not inaccurate, simply not processed.