Closed ami-day closed 2 years ago
I'll add myself as a secondary wrangler as this dataset is under the MRC grant!
@ami-day you should make sure you charge all the time you spent wrangling this dataset to the MRC cost centre. If you don't have access to it (42687), please say and we can sort it out.
I have reviewed the dataset, overall looks great! Just a couple of things I've changed and a bit I would like to discuss:
Project
Enrichment protocol
cell size selection method (just for consistency)
Specimen
colon as the organ, and left the organ part alone. In the dataset I am wrangling, there are several parts of the digestive tract (and even within the colon). I thought about adding digestive system as the organ and colon as the organ part. What do you think about it?
Cell suspension
I have saved the new spreadsheet in the folder with the addition of _ES_review to the filename.
@ami-day would it be possible for you to charge your work on this dataset to the MRC award in your timesheets?
Thanks @ESapenaVentura for this review, I'll have a look at this tomorrow and get back to you
This dataset has been exported to the Terra staging area.
Converting to SCEA.
@ami-day will the SCEA submission for this project also need to be fixed?
I'm not sure why it was moved to Finished before. I haven't finished converting it; I think it still needs validation and then review by the SCEA team. I will also make the necessary changes to the idf and sdrf files.
Submitted the metadata update in ingest, so I'm moving this ticket to the scea brokering column.
Did you confirm it was successfully exported and did you complete the export SOP for it?
https://app.zenhub.com/workspaces/operations-5fa2d8f2df78bb000f7fb2b5/issues/ebi-ait/hca-ebi-wrangler-central/280 looks like there is still some export validation needed to confirm this is okay before completing the export SOP
Status in ingest prod. is 'valid' not 'exported'. I think @aaclan-ebi and @MightyAx are still working on export of this since the updates to the metadata were made?
@aaclan-ebi I think this is dependent on the work you and Alexie are doing to re-arrange existing files in the staging area in order to enable safe updates
Do you have a timeframe for this work being finished?
I just finished the script to fix this. Just waiting for 1 more PR approval and I plan to run the script and this should be resolved by eod. Apologies for the delay.
Btw, Alexie has worked on a different fix which I believe wasn't blocking this ticket. The fix I was working on has now been applied. I asked @ami-day to confirm her updates and file the import request form to the Data Import team.
@aaclan-ebi this project is in the 'valid' state currently, not exported. Is that how it should be? I guess I need to submit the metadata changes?
Oh actually there are 2 projects, 1 has been exported https://contribute.data.humancellatlas.org/projects/detail?uuid=c893cb57-5c9f-4f26-9312-21b85be84313
@ami-day are there two versions of this project in the ingest system? Do you know which one to delete?
Does the one which is "correct" still need the fixes described in https://app.zenhub.com/workspaces/operations-5fa2d8f2df78bb000f7fb2b5/issues/ebi-ait/hca-ebi-wrangler-central/280?
I deleted the project which was not the correct one (in the 'valid' state). The correct one was in the 'exported' state and is the one I updated and @aaclan-ebi then fixed. I have submitted the import form for this dataset with a note that the metadata has been updated.
Already converted to SCEA format with E-HCAD-27. It was handed over but is in review and might be for a while because it includes full paths to bam files (no fastq paths are available).
This dataset has been exported but is missing some DNA content metadata for multiple records. This needs to be updated and re-exported. However it requires the ability to make bulk updates, so I am moving it to the needs update column for now.
@rays22 you seem to have linked the ticket to itself, is there an ops board ticket you meant to link to?
Sorry about that. I think the relevant ops board ticket is this: https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/280#issuecomment-832162128
The original errors were corrected, but the dataset is still missing the file content text and ontology field (value: DNA sequence) in multiple rows (https://data.humancellatlas.org/explore/projects/c893cb57-5c9f-4f26-9312-21b85be84313/project-metadata)
This update is blocked by our inability to make bulk updates.
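For reference, a minimal sketch of how the affected rows could be found and filled once the metadata is available as plain rows (for example, read from the spreadsheet). The column names `content_description.text` and `content_description.ontology`, the helper names, and the `EXAMPLE:0000` ontology ID used in the usage note are all assumptions for illustration, not the actual ingest schema or API.

```python
# Hypothetical column names; adjust to the headers used in the real spreadsheet.
TEXT_COL = "content_description.text"
ONTOLOGY_COL = "content_description.ontology"


def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def rows_missing_content(rows):
    """Return the indices of rows where either content description field is blank."""
    return [
        i
        for i, row in enumerate(rows)
        if is_blank(row.get(TEXT_COL)) or is_blank(row.get(ONTOLOGY_COL))
    ]


def fill_content(rows, text, ontology):
    """Return copies of the rows with blank content description fields filled in.

    Existing non-blank values are left untouched, and the input is not modified.
    """
    filled = []
    for row in rows:
        new_row = dict(row)
        if is_blank(new_row.get(TEXT_COL)):
            new_row[TEXT_COL] = text
        if is_blank(new_row.get(ONTOLOGY_COL)):
            new_row[ONTOLOGY_COL] = ontology
        filled.append(new_row)
    return filled
```

Usage would be something like `fill_content(rows, "DNA sequence", "EXAMPLE:0000")`, where the second argument is a placeholder to be replaced with the correct ontology term. Running `rows_missing_content` before and after gives a quick check that no rows were missed.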
@ami-day does the SCEA submission that you originally did need an update? Or can this ticket be closed?
This can be closed, the SCEA files do not need to be updated.
Re-opening to investigate this https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/554
Does not need any further changes. The data files are technically not inaccurate, simply not processed.
Wrangler responsible for this dataset/lab:
Primary: Ami
Secondary: Enrique
Description of the task:
GSE116222: EpithelialDiversityHealthInflammation
https://docs.google.com/spreadsheets/d/1vDUU36zcg5h3eUQMqO3GajI8VEKDvuZ0/edit#gid=851060064
Acceptance criteria for the task:
[x] convert metadata
[x] add ontologies
[x] get secondary review and make necessary changes
[x] validate with graph validator
[x] upload to ingest prod. for validation
[x] get cell type annotations and gene expression matrices