SciCrunch / sparc-curation

code and files for SPARC curation workflows
MIT License

Organ information in metadata is blank for all datasets #54

Closed jgrethe closed 4 years ago

jgrethe commented 4 years ago

wget https://cassava.ucsd.edu/sparc/exports/curation-export.json

Looks like there may be an issue with compiling the organ information in the JSON/TTL output. They are all blank.

grep "\"organ\":" curation-export.json

"organ": [],
"organ": [],
"organ": [],
"organ": [],
"organ": [],
"organ": [],
"organ": [],
"organ": [],
…
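The same check can be scripted so that it reports which datasets are affected rather than only showing blank lines. This is a minimal sketch, not the project's own tooling. It assumes the dataset blobs have already been loaded into a Python list (the top-level layout of curation-export.json is not shown in this thread, so loading is left to the caller) and that the organ list sits under meta.organ, as in the example blob further down. The function name and the sample ids are hypothetical.

```python
def datasets_missing_organ(datasets):
    """Return the ids of dataset blobs whose meta.organ is absent or empty."""
    missing = []
    for blob in datasets:
        organ = blob.get("meta", {}).get("organ")
        if not organ:  # covers both a missing key and an empty list
            missing.append(blob.get("id"))
    return missing


# Hypothetical stand-in for the real export, just to exercise the function.
sample = [
    {"id": "N:dataset:a", "meta": {"organ": []}},
    {"id": "N:dataset:b",
     "meta": {"organ": ["http://purl.obolibrary.org/obo/UBERON_0001155"]}},
    {"id": "N:dataset:c", "meta": {}},
]

print(datasets_missing_organ(sample))
```

If every dataset id comes back from this function, that matches the symptom reported here: the organ field is blank across the whole export.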

tgbugs commented 4 years ago

@bandrow confirming the issue as we discussed today.

SELECT DISTINCT
(COUNT (DISTINCT ?dataset) as ?count_dataset)
WHERE {
    ?dataset isAbout: ?region .
    ?region rdfs:label "heart".
}

This query produces zero results in the production TTL release and 20 in the preview release. Note that querying for ?dataset TEMP:involvesAnatomicalRegion UBERON:0000948 produces only two results in both cases, because I have been mapping the organ field to isAbout:, not to TEMP:involvesAnatomicalRegion. I chose that mapping because there are many cases where someone is studying the stellate ganglion and the study itself never involves the heart, yet it is still 'about' how the stellate ganglion modulates the behavior of the heart. This is reflected in the existing competency queries that we test.

See https://github.com/SciCrunch/sparc-curation/blob/799ef1789ae00114e32f12b15a0d7d898de89b67/test/test_data.py#L41-L46 and https://github.com/SciCrunch/sparc-curation/blob/799ef1789ae00114e32f12b15a0d7d898de89b67/sparcur/reports.py#L76.

As discussed, the right thing to do here is to test the current and previous releases at the same time so that we can detect the change.
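One way such a cross-release check could look is sketched below. It assumes both releases are available as lists of dataset blobs with an id and a meta.organ list, as in the JSON export. The function names and sample ids are hypothetical, and this is not the test that was actually added to the repository.

```python
def organ_by_id(datasets):
    """Map each dataset id to its meta.organ list (empty list if absent)."""
    return {b["id"]: b.get("meta", {}).get("organ") or [] for b in datasets}


def organ_regressions(previous, current):
    """Ids that had organ values in the previous release but none in the current one."""
    prev = organ_by_id(previous)
    curr = organ_by_id(current)
    return sorted(
        dataset_id
        for dataset_id, organs in prev.items()
        if organs and dataset_id in curr and not curr[dataset_id]
    )


# Hypothetical data: dataset "a" loses its organ between releases.
previous_release = [
    {"id": "N:dataset:a", "meta": {"organ": ["UBERON:0000948"]}},
    {"id": "N:dataset:b", "meta": {"organ": []}},
]
current_release = [
    {"id": "N:dataset:a", "meta": {"organ": []}},
    {"id": "N:dataset:b", "meta": {"organ": []}},
]

print(organ_regressions(previous_release, current_release))
```

Comparing against the previous release flags only datasets that lost information, so datasets that never had an organ do not produce false alarms.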

tgbugs commented 4 years ago

Now fixed in both preview and production release.

jgrethe commented 4 years ago

@tgbugs And production includes the Scaffold updates as well?

tgbugs commented 4 years ago

@jgrethe No, that is still in preview, https://cassava.ucsd.edu/sparc/preview/archive/exports/2020-08-06T03%3A19%3A47%2C781367-07%3A00/. I don't think we can move it out of preview until everyone is ready to transition, where everyone includes disco, foundry, and blackfynn. Given the order-of-magnitude increase in the size of the releases, we probably also need to start considering how to break them up.

jgrethe commented 4 years ago

Do the preview releases get updated? And if so - is there a latest shortcut URL?

tgbugs commented 4 years ago

Yes and yes. https://cassava.ucsd.edu/sparc/preview/exports/ is the folder and the json file is at https://cassava.ucsd.edu/sparc/preview/exports/curation-export.json. The preview should update shortly after the current production release, but is a bit delayed this week since I'm fixing a few final bugs.

jgrethe commented 4 years ago

OK - let me know when the bugs are fixed and I will re-tool the foundry transform.

bandrow commented 4 years ago

Bug fixed



jgrethe commented 4 years ago

@tgbugs - Is the preview release ready to be transformed for ABI?

tgbugs commented 4 years ago

@jgrethe yes, it is ready for ABI. Within a single dataset blob the scaffolds will be in

transform column "$.'scaffolds'[*]";
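For anyone reading the export outside the foundry transform, the path above selects every element of the top-level "scaffolds" list in a dataset blob. A rough plain-Python equivalent is sketched here. The helper name is hypothetical, and the sample blob is trimmed to a few of the fields shown in the example.

```python
def scaffold_records(blob):
    """Mimic the JSONPath $.'scaffolds'[*] on one dataset blob."""
    # Datasets without scaffolds are assumed to simply lack the key.
    return list(blob.get("scaffolds", []))


# Trimmed copy of the scaffold entry from the example blob.
blob = {
    "id": "N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
    "scaffolds": [
        {
            "dataset_relative_path": "derivative/Scaffold",
            "mimetype": "inode/vnd.abi.scaffold+directory",
            "organ": "UBERON:0001155",
            "species": "NCBITaxon:10090",
        }
    ],
}

for record in scaffold_records(blob):
    print(record["dataset_relative_path"], record["organ"], record["species"])
```
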

Example

Blob for the generic scaffold dataset.

{
      "contributors": [
        {
          "contributor_affiliation": "Auckland Bioegineering Institute",
          "contributor_name": "Lin, Mabelle",
          "contributor_orcid_id": {
            "id": "https://orcid.org/0000-0002-6388-2181",
            "label": "Mabelle Yuling Lin",
            "system": "Orcid",
            "type": "identifier"
          },
          "first_name": "Mabelle",
          "id": "https://orcid.org/0000-0002-6388-2181",
          "is_contact_person": true,
          "last_name": "Lin"
        },
        {
          "contributor_affiliation": "Auckland Bioegineering Institute",
          "contributor_name": "Christie, Richard",
          "contributor_orcid_id": {
            "id": "https://orcid.org/0000-0003-4336-4640",
            "label": "Gerald Christie",
            "system": "Orcid",
            "type": "identifier"
          },
          "first_name": "Richard",
          "id": "https://orcid.org/0000-0003-4336-4640",
          "is_contact_person": false,
          "last_name": "Christie"
        },
        {
          "contributor_affiliation": "Auckland Bioegineering Institute",
          "contributor_name": "Hunter, Peter",
          "contributor_orcid_id": {
            "id": "https://orcid.org/0000-0001-9665-4145",
            "label": "Peter Hunter",
            "system": "Orcid",
            "type": "identifier"
          },
          "contributor_role": [
            "PrincipalInvestigator"
          ],
          "first_name": "Peter",
          "id": "https://orcid.org/0000-0001-9665-4145",
          "is_contact_person": false,
          "last_name": "Hunter"
        }
      ],
      "errors": [
        {
          "blame": "stage",
          "message": "'protocol_url_or_doi' is a required property",
          "path": [
            "meta"
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "meta",
            "allOf",
            0,
            "required"
          ],
          "validator": "required",
          "validator_value": [
            "template_schema_version",
            "description",
            "funding",
            "protocol_url_or_doi",
            "number_of_subjects",
            "number_of_samples",
            "award_number",
            "principal_investigator",
            "species",
            "organ",
            "modality",
            "techniques",
            "contributor_count",
            "uri_human",
            "uri_api",
            "files",
            "dirs",
            "size",
            "folder_name",
            "title",
            "template_schema_version",
            "number_of_subjects",
            "number_of_samples",
            "timestamp_created",
            "timestamp_updated",
            "timestamp_updated_contents"
          ]
        },
        {
          "blame": "stage",
          "message": "'species' is a required property",
          "path": [
            "meta"
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "meta",
            "allOf",
            0,
            "required"
          ],
          "validator": "required",
          "validator_value": [
            "template_schema_version",
            "description",
            "funding",
            "protocol_url_or_doi",
            "number_of_subjects",
            "number_of_samples",
            "award_number",
            "principal_investigator",
            "species",
            "organ",
            "modality",
            "techniques",
            "contributor_count",
            "uri_human",
            "uri_api",
            "files",
            "dirs",
            "size",
            "folder_name",
            "title",
            "template_schema_version",
            "number_of_subjects",
            "number_of_samples",
            "timestamp_created",
            "timestamp_updated",
            "timestamp_updated_contents"
          ]
        },
        {
          "blame": "stage",
          "message": "'http://purl.obolibrary.org/obo/UBERON_0001155' is not of type 'object'",
          "path": [
            "meta",
            "organ",
            0
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "meta",
            "allOf",
            0,
            "properties",
            "organ",
            "items",
            "allOf",
            0,
            "type"
          ],
          "validator": "type",
          "validator_value": "object"
        },
        {
          "blame": "stage",
          "message": "[] is too short",
          "path": [
            "meta",
            "techniques"
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "meta",
            "allOf",
            0,
            "properties",
            "techniques",
            "minItems"
          ],
          "validator": "minItems",
          "validator_value": 1
        },
        {
          "blame": "stage",
          "message": "{'acknowledgements': ... 1337 bytes later ... 98b3-e48b68f65ec9'} is not valid under any of the given schemas",
          "path": [
            "meta"
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "meta",
            "allOf",
            1,
            "anyOf"
          ],
          "validator": "anyOf",
          "validator_value": [
            {
              "required": [
                "subject_count"
              ]
            },
            {
              "required": [
                "sample_count"
              ]
            }
          ]
        },
        {
          "blame": "stage",
          "message": "'protocol_url_or_doi' is a required property",
          "path": [
            "inputs",
            "dataset_description_file"
          ],
          "pipeline_stage": "IrToExportJsonPipeline.data",
          "schema_path": [
            "allOf",
            0,
            "properties",
            "inputs",
            "properties",
            "dataset_description_file",
            "required"
          ],
          "validator": "required",
          "validator_value": [
            "template_schema_version",
            "name",
            "description",
            "funding",
            "protocol_url_or_doi",
            "contributors",
            "number_of_subjects",
            "number_of_samples"
          ]
        }
      ],
      "id": "N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
      "inputs": {
        "dataset_description_file": {
          "acknowledgements": "Marthe Howard, Wang Li-xin",
          "contributors": [
            {
              "contributor_affiliation": "Auckland Bioegineering Institute",
              "contributor_name": "Lin, Mabelle",
              "contributor_orcid_id": "https://orcid.org/0000-0002-6388-2181",
              "contributor_role": [
                "Creator"
              ],
              "is_contact_person": true
            },
            {
              "contributor_affiliation": "Auckland Bioegineering Institute",
              "contributor_name": "Christie, Richard",
              "contributor_orcid_id": "https://orcid.org/0000-0003-4336-4640",
              "contributor_role": [
                "Creator"
              ],
              "is_contact_person": false
            },
            {
              "contributor_affiliation": "Auckland Bioegineering Institute",
              "contributor_name": "Hunter, Peter",
              "contributor_orcid_id": "https://orcid.org/0000-0001-9665-4145",
              "contributor_role": [
                "PrincipalInvestigator"
              ],
              "is_contact_person": false
            }
          ],
          "description": "Annotated mouse colon scaffold available for registration of segmented neural anatomical-functional mapping of enteric neural circuits.",
          "errors": [
            {
              "blame": "stage",
              "message": "'protocol_url_or_doi' is a required property",
              "pipeline_stage": "DatasetDescriptionFilePipeline.data",
              "schema_path": [
                "required"
              ],
              "validator": "required",
              "validator_value": [
                "template_schema_version",
                "name",
                "description",
                "funding",
                "protocol_url_or_doi",
                "contributors",
                "number_of_subjects",
                "number_of_samples"
              ]
            }
          ],
          "funding": [
            "OT3OD025349"
          ],
          "keywords": [
            "colon",
            "mouse",
            "mesenteric zone",
            "transverse colon",
            "proximal colon",
            "distal colon"
          ],
          "links": [
            {
              "additional_links": "https://github.com/ABI-Software/scaffoldmaker",
              "link_description": "Link to GitHub repository for scaffold"
            }
          ],
          "name": "Generic mouse colon scaffold",
          "number_of_samples": 0,
          "number_of_subjects": 0,
          "template_schema_version": "1.2.3",
          "title_for_complete_data_set": "Generic colon scaffolds"
        },
        "manifest_file": [
          {
            "contents": {
              "manifest_records": [
                {
                  "additional_types": "inode/vnd.abi.scaffold+directory",
                  "description": "3d scaffolds folder",
                  "file_type": "folder",
                  "filename": "Scaffold",
                  "organ": "UBERON:0001155",
                  "species": "NCBITaxon:10090"
                }
              ]
            },
            "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
            "dataset_relative_path": "derivative/manifest.xlsx",
            "mimetype": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            "remote_id": "package:3cc305fd-b2a0-4ba6-92be-48346dae9cbf",
            "type": "path",
            "uri_api": "https://api.blackfynn.io/packages/N:package:3cc305fd-b2a0-4ba6-92be-48346dae9cbf/files/1182587",
            "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/viewer/N:package:3cc305fd-b2a0-4ba6-92be-48346dae9cbf"
          },
          {
            "contents": {
              "manifest_records": [
                {
                  "description": "Contains node and element information",
                  "file_type": "exf",
                  "filename": "mouseColon.exf",
                  "organ": "UBERON:0001155",
                  "species": "NCBITaxon:10090"
                },
                {
                  "description": "Contains commands to output scaffold on cmgui",
                  "file_type": "cmgui",
                  "filename": "mouseColonScaffold.cmgui",
                  "organ": "UBERON:0001155",
                  "species": "NCBITaxon:10090"
                }
              ]
            },
            "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
            "dataset_relative_path": "primary/manifest.xlsx",
            "mimetype": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            "remote_id": "package:3ed55ce1-fd19-4bd9-9ab8-708b67094b73",
            "type": "path",
            "uri_api": "https://api.blackfynn.io/packages/N:package:3ed55ce1-fd19-4bd9-9ab8-708b67094b73/files/1182588",
            "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/viewer/N:package:3ed55ce1-fd19-4bd9-9ab8-708b67094b73"
          }
        ],
        "remote_dataset_metadata": {
          "contributors": [
            {
              "email": "mabelle.lin@auckland.ac.nz",
              "firstName": "Mabelle",
              "id": 475,
              "lastName": "Lin",
              "orcid": "0000-0002-6388-2181",
              "userId": 670
            },
            {
              "email": "r.christie@auckland.ac.nz",
              "firstName": "Richard",
              "id": 476,
              "lastName": "Christie",
              "orcid": "0000-0003-4336-4640"
            },
            {
              "email": "p.hunter@auckland.ac.nz",
              "firstName": "Peter",
              "id": 477,
              "lastName": "Hunter",
              "orcid": "0000-0001-9665-4145"
            }
          ],
          "description": "Annotated mouse colon scaffold available for registration of segmented neural anatomical-functional mapping of enteric neural circuits.",
          "doi": "https://doi.org/10.26275/dwly-naxx",
          "id": "N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
          "license": "Creative Commons Attribution",
          "name": "Generic mouse colon scaffold",
          "package_counts": {
            "Collection": 3,
            "Image": 1,
            "Unknown": 2,
            "Unsupported": 11
          },
          "readme": {
            "readme": "**Study Purpose:** The goal of this work is to create annotated generic mouse scaffolds for registration of segmented data obtained by experimental groups.\n\n**Data Collected:** The generic scaffold is created based on the general description and average dimensions of mouse colons as measured by experimental groups who provide data to be registered on the scaffold.\n\n**Primary Conclusion:** None stated\n\n---\n**Curator\u2019s Notes:**\n\n**Experimental design**: Not applicable. \n\n**Completeness:** The study is ongoing and potentially will link to other datasets where the data is used for mapping onto the scaffold.\n\n**Subjects & Samples:** The generic scaffold is not subject/sample specific but represents an average colon of subjects used in other studies.\n\n**Primary vs derivative:** In the primary folder, nodal information and element connectivity of the scaffold are formatted as an exf file. Using the commands in the cmgui file, the scaffold can be visualized using open source software CMGUI. The derivative folder contains json files which are used to generate a visualisation of the scaffold on the web portal.\n\n**Code availability:** https://github.com/ABI-Software/scaffoldmaker ABI-Software/scaffoldmaker Anatomical scaffold generator using OpenCMISS"
          },
          "status-log": {
            "entries": [
              {
                "status": {
                  "displayName": "12. Published (Investigator)",
                  "id": 13,
                  "name": "12_PUBLISHED_INVESTIGATOR"
                },
                "updatedAt": "2020-08-04T18:29:58.809324Z",
                "user": {
                  "firstName": "Anna (Anka)",
                  "lastName": "Pilko",
                  "nodeId": "N:user:67103fc6-3507-4334-9dc6-bdeb0c27e9a4"
                }
              },
              {
                "status": {
                  "displayName": "11. Complete, Under Embargo (Investigator)",
                  "id": 12,
                  "name": "11_COMPLETE_UNDER_EMBARGO_INVESTIGATOR"
                },
                "updatedAt": "2020-04-22T22:14:41.413709Z",
                "user": {
                  "firstName": "Jesse",
                  "lastName": "Khorasanee",
                  "nodeId": "N:user:8dc75e38-ceb6-4300-a718-58a2dbf6c257"
                }
              },
              {
                "status": {
                  "displayName": "01. Template Dataset (Default)",
                  "id": 2,
                  "name": "01_TEMPLATE_DATASET_DEFAULT"
                },
                "updatedAt": "2020-04-22T05:05:24.12754Z",
                "user": {
                  "firstName": "Mabelle",
                  "lastName": "Lin",
                  "nodeId": "N:user:f898a6c3-99b1-4bda-b974-11123906f32b"
                }
              }
            ],
            "limit": 25,
            "offset": 0,
            "totalCount": 3
          },
          "tags": [
            "mesenteric zone",
            "colon",
            "mouse",
            "proximal colon",
            "transverse colon",
            "distal colon"
          ]
        },
        "submission_file": {
          "submission": {
            "milestone_achieved": "create tools to visualize data",
            "milestone_completion_date": "2020-01-30T00:00:00",
            "sparc_award_number": "OT3OD025349"
          }
        }
      },
      "meta": {
        "acknowledgements": "Marthe Howard, Wang Li-xin",
        "award_number": "OT3OD025349",
        "contributor_count": 3,
        "description": "Annotated mouse colon scaffold available for registration of segmented neural anatomical-functional mapping of enteric neural circuits.",
        "dirs": 3,
        "doi": {
          "category": "Dataset",
          "id": "https://doi.org/10.26275/dwly-naxx",
          "label": "Generic mouse colon scaffold",
          "system": "Doi",
          "type": "identifier"
        },
        "files": 14,
        "folder_name": "Generic mouse colon scaffold",
        "funding": [
          "OT3OD025349"
        ],
        "keywords": [
          "colon",
          "mouse",
          "mesenteric zone",
          "transverse colon",
          "proximal colon",
          "distal colon"
        ],
        "modality": [
          "models"
        ],
        "number_of_samples": 0,
        "number_of_subjects": 0,
        "organ": [
          "http://purl.obolibrary.org/obo/UBERON_0001155"
        ],
        "principal_investigator": "/contributors/2",
        "size": 3677843,
        "techniques": [],
        "template_schema_version": "1.2.3",
        "timestamp_created": "2020-04-22T05:05:24,127540Z",
        "timestamp_updated": "2020-08-04T18:29:58,809324Z",
        "timestamp_updated_contents": "2020-08-06T06:41:00,284912Z",
        "title": "Generic mouse colon scaffold",
        "title_for_complete_data_set": "Generic colon scaffolds",
        "uri_api": "https://api.blackfynn.io/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
        "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9"
      },
      "path_metadata": [
        {
          "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
          "dataset_relative_path": "derivative/Scaffold",
          "manifest_record": {
            "additional_types": "inode/vnd.abi.scaffold+directory",
            "description": "3d scaffolds folder",
            "file_type": "folder",
            "filename": "Scaffold",
            "organ": "UBERON:0001155",
            "species": "NCBITaxon:10090"
          },
          "mimetype": "inode/vnd.abi.scaffold+directory",
          "prov": "derivative/manifest.xlsx",
          "remote_id": "collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48",
          "type": "path",
          "uri_api": "https://api.blackfynn.io/packages/N:collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48",
          "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/files/N:collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48"
        },
        {
          "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
          "dataset_relative_path": "primary/mouseColon.exf",
          "manifest_record": {
            "description": "Contains node and element information",
            "file_type": "exf",
            "filename": "mouseColon.exf",
            "organ": "UBERON:0001155",
            "species": "NCBITaxon:10090"
          },
          "prov": "primary/manifest.xlsx",
          "remote_id": "package:80b81c85-0d4a-4134-9bd9-fa3fee91a254",
          "type": "path",
          "uri_api": "https://api.blackfynn.io/packages/N:package:80b81c85-0d4a-4134-9bd9-fa3fee91a254/files/1182507",
          "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/viewer/N:package:80b81c85-0d4a-4134-9bd9-fa3fee91a254"
        },
        {
          "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
          "dataset_relative_path": "primary/mouseColonScaffold.cmgui",
          "manifest_record": {
            "description": "Contains commands to output scaffold on cmgui",
            "file_type": "cmgui",
            "filename": "mouseColonScaffold.cmgui",
            "organ": "UBERON:0001155",
            "species": "NCBITaxon:10090"
          },
          "prov": "primary/manifest.xlsx",
          "remote_id": "package:c4416a8a-e7ae-41ff-9afe-2db9b11726e1",
          "type": "path",
          "uri_api": "https://api.blackfynn.io/packages/N:package:c4416a8a-e7ae-41ff-9afe-2db9b11726e1/files/1182506",
          "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/viewer/N:package:c4416a8a-e7ae-41ff-9afe-2db9b11726e1"
        }
      ],
      "prov": {
        "export_hostname": "athena",
        "export_project_path": "/mnt/str/tom/blackfynn_local/frik/poopfrik/SPARC Consortium",
        "export_system_identifier": "Vhe5hmmFAGxY_hQNCDPT0g",
        "timestamp_export_start": "2020-08-06T10:03:18,308354Z"
      },
      "rmeta": {
        "readme": "**Study Purpose:** The goal of this work is to create annotated generic mouse scaffolds for registration of segmented data obtained by experimental groups.\n\n**Data Collected:** The generic scaffold is created based on the general description and average dimensions of mouse colons as measured by experimental groups who provide data to be registered on the scaffold.\n\n**Primary Conclusion:** None stated\n\n---\n**Curator\u2019s Notes:**\n\n**Experimental design**: Not applicable. \n\n**Completeness:** The study is ongoing and potentially will link to other datasets where the data is used for mapping onto the scaffold.\n\n**Subjects & Samples:** The generic scaffold is not subject/sample specific but represents an average colon of subjects used in other studies.\n\n**Primary vs derivative:** In the primary folder, nodal information and element connectivity of the scaffold are formatted as an exf file. Using the commands in the cmgui file, the scaffold can be visualized using open source software CMGUI. The derivative folder contains json files which are used to generate a visualisation of the scaffold on the web portal.\n\n**Code availability:** https://github.com/ABI-Software/scaffoldmaker ABI-Software/scaffoldmaker Anatomical scaffold generator using OpenCMISS"
      },
      "scaffolds": [
        {
          "dataset_id": "dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9",
          "dataset_relative_path": "derivative/Scaffold",
          "manifest_record": {
            "additional_types": "inode/vnd.abi.scaffold+directory",
            "description": "3d scaffolds folder",
            "file_type": "folder",
            "filename": "Scaffold",
            "organ": "UBERON:0001155",
            "species": "NCBITaxon:10090"
          },
          "mimetype": "inode/vnd.abi.scaffold+directory",
          "organ": "UBERON:0001155",
          "prov": "derivative/manifest.xlsx",
          "remote_id": "collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48",
          "species": "NCBITaxon:10090",
          "type": "path",
          "uri_api": "https://api.blackfynn.io/packages/N:collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48",
          "uri_human": "https://app.blackfynn.io/N:organization:618e8dd9-f8d2-4dc4-9abb-c6aaab2e78a0/datasets/N:dataset:5427de60-5bf8-4617-98b3-e48b68f65ec9/files/N:collection:4017a11f-b644-4b43-beae-1e5cdd2f1a48"
        }
      ],
      "status": {
        "curation_errors": [
          {
            "blame": "stage",
            "message": "'protocol_url_or_doi' is a required property",
            "path": [
              "inputs",
              "dataset_description_file"
            ],
            "pipeline_stage": "PipelineExtras.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "inputs",
              "properties",
              "dataset_description_file",
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "name",
              "description",
              "funding",
              "protocol_url_or_doi",
              "contributors",
              "number_of_subjects",
              "number_of_samples"
            ]
          },
          {
            "blame": "stage",
            "message": "{'folder_name': 'Gen ... 1226 bytes later ... o/UBERON_0001155']} is not valid under any of the given schemas",
            "path": [
              "meta"
            ],
            "pipeline_stage": "PipelineExtras.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              1,
              "anyOf"
            ],
            "validator": "anyOf",
            "validator_value": [
              {
                "required": [
                  "subject_count"
                ]
              },
              {
                "required": [
                  "sample_count"
                ]
              }
            ]
          },
          {
            "blame": "stage",
            "message": "[] is too short",
            "path": [
              "meta",
              "techniques"
            ],
            "pipeline_stage": "PipelineExtras.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "properties",
              "techniques",
              "minItems"
            ],
            "validator": "minItems",
            "validator_value": 1
          },
          {
            "blame": "stage",
            "message": "'species' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "PipelineExtras.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'protocol_url_or_doi' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "PipelineExtras.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'protocol_url_or_doi' is a required property",
            "path": [
              "inputs",
              "dataset_description_file"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "inputs",
              "properties",
              "dataset_description_file",
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "name",
              "description",
              "funding",
              "protocol_url_or_doi",
              "contributors",
              "number_of_subjects",
              "number_of_samples"
            ]
          },
          {
            "blame": "stage",
            "message": "[] is too short",
            "path": [
              "contributors",
              1,
              "contributor_role"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "contributors",
              "items",
              "properties",
              "contributor_role",
              "minItems"
            ],
            "validator": "minItems",
            "validator_value": 1
          },
          {
            "blame": "stage",
            "message": "[] is too short",
            "path": [
              "contributors",
              0,
              "contributor_role"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "contributors",
              "items",
              "properties",
              "contributor_role",
              "minItems"
            ],
            "validator": "minItems",
            "validator_value": 1
          },
          {
            "blame": "stage",
            "message": "{'folder_name': 'Gen ... 1079 bytes later ... tributor_count': 3} is not valid under any of the given schemas",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              1,
              "anyOf"
            ],
            "validator": "anyOf",
            "validator_value": [
              {
                "required": [
                  "subject_count"
                ]
              },
              {
                "required": [
                  "sample_count"
                ]
              }
            ]
          },
          {
            "blame": "stage",
            "message": "'techniques' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'modality' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'organ' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'species' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          },
          {
            "blame": "stage",
            "message": "'protocol_url_or_doi' is a required property",
            "path": [
              "meta"
            ],
            "pipeline_stage": "SPARCBIDSPipeline.data",
            "schema_path": [
              "allOf",
              0,
              "properties",
              "meta",
              "allOf",
              0,
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "description",
              "funding",
              "protocol_url_or_doi",
              "number_of_subjects",
              "number_of_samples",
              "award_number",
              "principal_investigator",
              "species",
              "organ",
              "modality",
              "techniques",
              "contributor_count",
              "uri_human",
              "uri_api",
              "files",
              "dirs",
              "size",
              "folder_name",
              "title",
              "template_schema_version",
              "number_of_subjects",
              "number_of_samples",
              "timestamp_created",
              "timestamp_updated",
              "timestamp_updated_contents"
            ]
          }
        ],
        "curation_index": 14,
        "error_index": 16,
        "path_error_report": {
          "#/": {
            "error_count": 1,
            "messages": [
              "{'size': 3677843, 'd ... 524 bytes later ... ry/manifest.xlsx']} is not valid under any of the given schemas"
            ]
          },
          "#/contributors/-1/contributor_role": {
            "error_count": 2,
            "messages": [
              "[] is too short"
            ]
          },
          "#/inputs/dataset_description_file": {
            "error_count": 2,
            "messages": [
              "'protocol_url_or_doi' is a required property"
            ]
          },
          "#/meta": {
            "error_count": 7,
            "messages": [
              "'modality' is a required property",
              "'organ' is a required property",
              "'protocol_url_or_doi' is a required property",
              "'species' is a required property",
              "'techniques' is a required property",
              "{'folder_name': 'Gen ... 1079 bytes later ... tributor_count': 3} is not valid under any of the given schemas",
              "{'folder_name': 'Gen ... 1226 bytes later ... o/UBERON_0001155']} is not valid under any of the given schemas"
            ]
          },
          "#/meta/techniques": {
            "error_count": 1,
            "messages": [
              "[] is too short"
            ]
          }
        },
        "status_on_platform": {
          "status": {
            "displayName": "12. Published (Investigator)",
            "id": 13,
            "name": "12_PUBLISHED_INVESTIGATOR"
          },
          "updatedAt": "2020-08-04T18:29:58.809324Z",
          "user": {
            "firstName": "Anna (Anka)",
            "lastName": "Pilko",
            "nodeId": "N:user:67103fc6-3507-4334-9dc6-bdeb0c27e9a4"
          }
        },
        "submission_errors": [
          {
            "blame": "stage",
            "message": "'protocol_url_or_doi' is a required property",
            "path": [
              "inputs",
              "dataset_description_file"
            ],
            "pipeline_stage": "DatasetDescriptionFilePipeline.data",
            "schema_path": [
              "required"
            ],
            "validator": "required",
            "validator_value": [
              "template_schema_version",
              "name",
              "description",
              "funding",
              "protocol_url_or_doi",
              "contributors",
              "number_of_subjects",
              "number_of_samples"
            ]
          },
          {
            "blame": "stage",
            "message": "{'size': 3677843, 'd ... 524 bytes later ... ry/manifest.xlsx']} is not valid under any of the given schemas",
            "path": [],
            "pipeline_stage": "DatasetStructurePipeline.data",
            "schema_path": [
              "allOf",
              1,
              "anyOf"
            ],
            "validator": "anyOf",
            "validator_value": [
              {
                "required": [
                  "subjects_file"
                ]
              },
              {
                "required": [
                  "samples_file"
                ]
              }
            ]
          }
        ],
        "submission_index": 2,
        "unclassified_errors": [],
        "unclassified_index": 0,
        "unclassified_stages": []
      },
      "submission": {
        "milestone_achieved": "create tools to visualize data",
        "milestone_completion_date": "2020-01-30T00:00:00",
        "sparc_award_number": "OT3OD025349"
      }
    }