Closed aaclan-ebi closed 2 years ago
Looking at Andrew's original message on slack, it looks like there's another issue we need to resolve.
The exporter is populating provenance.schema_major_version and provenance.schema_minor_version for the metadata JSON files. However, the cell suspension uses an older version of the schema, which doesn't contain those fields.
{
  "describedBy": "https://schema.humancellatlas.org/type/biomaterial/13.1.0/cell_suspension",
  "schema_type": "biomaterial",
  "biomaterial_core": {
    "biomaterial_id": "M_C57BL/6_pancreas_cells_batch2",
    "biomaterial_description": "Mouse islets were isolated (and pooled) from five C57BL/6 and ICR mice by perfusion of the common bile duct with 0.8 mM Collagenase P (Roche), digestion of the pancreata with 0.8 mM Collagenase P (Roche) and purification of the islets by Histopaque gradient (Sigma) centrifugation.",
    "ncbi_taxon_id": [
      10090
    ]
  },
  "genus_species": [
    {
      "text": "Mus musculus",
      "ontology": "NCBITaxon:10090",
      "ontology_label": "Mus musculus"
    }
  ],
  "selected_cell_types": [
    {
      "text": "pancreatic PP cell",
      "ontology": "CL:0002275",
      "ontology_label": "pancreatic PP cell"
    }
  ],
  "estimated_cell_count": 334,
  "provenance": {
    "document_id": "02341f59-b12e-4039-83c6-b563919f7845",
    "submission_date": "2019-07-04T13:38:38.022Z",
    "update_date": "2019-07-04T13:38:44.588Z",
    "schema_major_version": 13,
    "schema_minor_version": 1
  }
}
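For reference, here is a minimal sketch of how the schema version could be read from describedBy and compared with the provenance fields in a document like the one above. The helper names (schema_version, provenance_matches_schema) are hypothetical and are not taken from the exporter code; the sketch only assumes the describedBy URL ends in /<major>.<minor>.<patch>/<schema name>, as in the example.

```python
import re

# Hypothetical helper, not part of the exporter: matches the trailing
# "/<major>.<minor>.<patch>/<schema_name>" part of a describedBy URL.
_VERSION_RE = re.compile(r"/(\d+)\.(\d+)\.(\d+)/[^/]+$")


def schema_version(described_by):
    """Return (major, minor, patch) parsed from a describedBy URL."""
    match = _VERSION_RE.search(described_by)
    if match is None:
        raise ValueError(f"no schema version found in {described_by!r}")
    return tuple(int(part) for part in match.groups())


def provenance_matches_schema(document):
    """Check that the provenance version fields agree with describedBy."""
    major, minor, _patch = schema_version(document["describedBy"])
    provenance = document.get("provenance", {})
    return (
        provenance.get("schema_major_version") == major
        and provenance.get("schema_minor_version") == minor
    )
```

For the cell suspension above, schema_version gives (13, 1, 0), which agrees with the provenance values 13 and 1. The values themselves are consistent; the problem is that this schema version does not define the two fields at all.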
Options for the solution:
1. Update the exporter to not populate these provenance fields for metadata JSONs that have an older schema version, and clean up the already exported metadata to remove those fields.
2. Update the schema version of the metadata JSONs.
cc @amnonkhen @clairerye
I believe we need to fix the provenance issue first before asking the Data Import team to reimport this project.
I would advocate for option 1; option 2 seems more on the migration side (which we'll have to prioritise eventually, but it's a big task). If we are not implementing migrations fully, I think it's more sustainable for the exporter to detect when the fields do not exist.
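A rough sketch of what that detection could look like, assuming the exporter can obtain the set of provenance property names defined by the document's schema version. How that set is obtained is left open here, and the function and constant names are hypothetical rather than taken from the exporter.

```python
import copy

# The two fields discussed in this issue.
VERSION_FIELDS = ("schema_major_version", "schema_minor_version")


def strip_unsupported_provenance(document, supported_fields):
    """Return a copy of the document without version fields that the
    document's schema version does not define.

    supported_fields is an assumption: the provenance property names
    defined by the schema version the document points to.
    """
    cleaned = copy.deepcopy(document)  # leave the caller's document untouched
    provenance = cleaned.get("provenance", {})
    for field in VERSION_FIELDS:
        if field not in supported_fields:
            provenance.pop(field, None)
    return cleaned
```

The same function could be used both when exporting and for cleaning up metadata that has already been exported, since it is a no-op when the schema does define the fields.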
Moving this task to the Stalled column after discussing with Claire on Friday. I believe there is a ticket in Dev that will address the issue, and this specific task is blocked until that dev ticket is delivered.
ebi-ait/dcp-ingest-central#376 DCP1 project updates to terra
@aaclan-ebi and @jacobwindsor is this effectively done?
Tested on staging and works well
Promoting to prod
Is there an SOP for updating dcp1 projects somewhere? @jacobwindsor
There is not, but give me a sec and I'll publish my script.
slack: https://embl-ebi-ait.slack.com/archives/C9XD6L0AD/p1625058217091100
A DCP 1 Project with uuid 577c946d-6de5-4b55-a854-cd3fde40bff2 failed importing because there are no data files.
Tasks (as agreed with the Data Import team):
[ ] Delete the directories:
    metadata/sequence_file/
    descriptor/
[ ] Make sure data import team imports the staging area successfully
[ ] Write SOP for updating dcp1 projects
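Until the script is published, here is a storage-agnostic sketch of the directory clean-up step. It only selects which paths in a staging area listing fall under the two directories named in the task. The listing format (paths relative to the staging area root) and the function name are assumptions, and the actual deletion is left to whatever storage client is in use.

```python
# Directories named in the task list above, relative to the staging area root.
STALE_PREFIXES = ("metadata/sequence_file/", "descriptor/")


def select_stale_paths(paths, prefixes=STALE_PREFIXES):
    """Return the paths that sit under one of the given directory prefixes."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    return [path for path in paths if path.startswith(tuple(prefixes))]
```

Reviewing the selected paths before deleting anything keeps the step easy to check, which could be useful when writing it up as an SOP.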