Closed adamjtaylor closed 2 years ago
The following attributes are required in the schema and have valid values set
Each has an "Other" value that enables submission of a custom workflow, but none has a "None"/"Not used"/"NA"/"Not Applicable" option
Options to fix:
@elv-sb what do you think?
I should actually read the source ticket! Proposed solution was actually more drastic
Ok, thanks! Just like the google sheet. Ah I found a bug in the google sheet, there were two columns with same name. OK fixed. There should be one column called “Workflow Type” where they select the type of data (SNPs, germline, or CNVs), and one column named “Workflow Name” where they put a name to the tool-family used in producing the data. One “Workflow URL” column.
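To make the three-column layout described above concrete, here is a minimal Python sketch. It is not the HTAN schema itself: the class name `WorkflowRecord`, the field names, and the exact spelling of the workflow types are assumptions for illustration, and the tool name and URL in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import List

# Workflow types mentioned in the comment above. The exact valid-value
# strings in the real schema are an assumption.
WORKFLOW_TYPES = {"SNPs", "germline", "CNVs"}


@dataclass
class WorkflowRecord:
    """One row of the proposed layout: a single type, name, and URL."""

    workflow_type: str  # "Workflow Type" column
    workflow_name: str  # "Workflow Name" column (tool family)
    workflow_url: str   # "Workflow URL" column

    def errors(self) -> List[str]:
        """Return a list of problems; an empty list means the row is valid."""
        problems = []
        if self.workflow_type not in WORKFLOW_TYPES:
            problems.append(
                f"Workflow Type must be one of {sorted(WORKFLOW_TYPES)}"
            )
        if not self.workflow_name.strip():
            problems.append("Workflow Name is empty")
        return problems


# Placeholder values, not real submissions.
ok_row = WorkflowRecord("CNVs", "example-caller", "https://example.org/workflow")
bad_row = WorkflowRecord("RNA", "", "")
print(ok_row.errors())
print(bad_row.errors())
```

With a single "Workflow Type" column there is nothing to leave blank for workflows that were not run, which is the problem the original ticket describes.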
The ticket proposed solution is what is in the RFC spreadsheet updated by @Gibbsdavidl in Feb.
Implementing this would assume that only one workflow is run (as per the ticket). Are there cases where multiple workflow types are run for a single Bulk WES L3 file?
Yes, they may run more than one workflow, but the thought was that each would produce a separate result and should get its own metadata. For example, a center could run a SNP calling workflow and a copy number workflow; those would be different results with different metadata rather than a combined result set. In TCGA, copy number and SNP results are separate.
Thanks @Gibbsdavidl. If we implement the changes per the RFC, we should do a quick review of what metadata has been submitted for these attributes in the existing template, to get a sense of how many centers will need to resubmit metadata to meet the updated schema.
Looks like we have no Bulk WES level 3 data yet, so we should be good to implement the changes per the RFC
@elv-sb and I agree that we will action this once the data model to DCA flow is unblocked
Closing as this has now been implemented by setting the workflows to not be required. Reported back to @vthorsson by email. Can re-open if Vanderbilt still have any template issues with metadata submission
Re-opening as per HTAN-35: we also need to add none to the valid workflow values.
Information
Proposed change:
A clear and concise description of proposed data model amendments
In the Bulk WES level 3 metadata template, the germline, somatic, and structural variants workflows are all required for each sample. Each of our samples is analyzed by only one workflow, not all three. Leaving the other workflows empty, or setting them to NA or Not Applicable, is not allowed, which makes our metadata invalid. Any suggestions to solve this?
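The validation failure described here, and the two fixes discussed in the thread (making the workflows not required, and adding a none value), can be sketched in Python. This is an illustration under assumptions, not the actual validator: the attribute names, the `validate` function, and the contents of `BASE_VALID` are hypothetical, since the thread does not list the real attribute names or their full valid-value sets.

```python
from typing import Dict, List, Set

# Hypothetical attribute names; the real template's names are not given here.
ATTRIBUTES = [
    "Germline Variants Workflow",
    "Somatic Variants Workflow",
    "Structural Variants Workflow",
]

# Assumed valid values: the real schema also lists named workflows.
BASE_VALID: Set[str] = {"Other"}


def validate(row: Dict[str, str], valid_values: Set[str], required: bool) -> List[str]:
    """Check each workflow attribute against the required flag and valid values."""
    errors = []
    for attr in ATTRIBUTES:
        value = row.get(attr, "").strip()
        if not value:
            # A blank cell only fails when the attribute is required.
            if required:
                errors.append(f"{attr}: value is required")
            continue
        if value not in valid_values:
            errors.append(f"{attr}: {value!r} is not a valid value")
    return errors


# A sample analyzed by only one workflow, with "NA" in the other two cells.
sample = {
    "Germline Variants Workflow": "NA",
    "Somatic Variants Workflow": "Other",
    "Structural Variants Workflow": "NA",
}

# Original behaviour: required, and "NA" is not a valid value.
print(validate(sample, BASE_VALID, required=True))

# Fix 1: attributes not required, unused workflows left blank.
blank = {"Somatic Variants Workflow": "Other"}
print(validate(blank, BASE_VALID, required=False))

# Fix 2: a "None" value added to the valid values.
with_none = {attr: "None" for attr in ATTRIBUTES}
with_none["Somatic Variants Workflow"] = "Other"
print(validate(with_none, BASE_VALID | {"None"}, required=True))
```

The two fixes are independent: relaxing the required flag lets submitters leave cells blank, while adding a none value lets them state explicitly that a workflow was not run.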
Discussion link:
Link to where the changes have been discussed/approved, e.g. closed RFC, service desk ticket, DCC coordination/operations agenda
HTAN-35
How important is this feature?
Select from the options below
When will use cases depending on this become relevant?
Select from the options below
Additional context
Add any other context or screenshots about the feature request here
Identified from publication review
Implementation checklist
HTAN.model.csv edited and pushed
HTAN.model.jsonld validated and created by GitHub action
main ncihtan/HTAN-data-curator (if required)