Summary of request
Dynamic schemas should define the file types that can be submitted for each type of analysis. As a practical example, the file types submitted for sequencing experiments (BAM) are not the same as those for variant calls (VCF), and if a new analysis schema were created for Volumetric Imaging files, another file type would be required.
Details
Currently, the allowed file types are an arbitrary list defined in the base schema. This feature request would change that so that:
[x] The base schema allows any string value for file types
[x] There is a mechanism in dynamic schemas to define additional restrictions on file.fileType for that Analysis Type
If the user provides a definition for file.fileType, it will merge with the definition from the base schema.
This will allow the user to add additional restrictions on file.fileType for this analysis type.
I am not sure if the current schema merging process will allow type definitions to be extended/merged in this way. Hopefully some quick trials will determine if this is possible.
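One such quick trial could look like the sketch below. It is only an illustration: `deep_merge`, `BASE_SCHEMA`, and `VARIANT_CALL_SCHEMA` are hypothetical names, the schema fragments are assumed shapes, and the recursive merge rule is an assumption rather than the behaviour of the current schema merging process.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`.

    Hypothetical merge rule: nested dicts are merged key by key,
    any other value in `override` replaces the value in `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed base schema fragment: fileType accepts any string.
BASE_SCHEMA = {
    "definitions": {
        "fileType": {"type": "string"},
    },
}

# Assumed dynamic schema for one analysis type: narrow fileType to VCF.
VARIANT_CALL_SCHEMA = {
    "definitions": {
        "fileType": {"enum": ["VCF"]},
    },
}

merged = deep_merge(BASE_SCHEMA, VARIANT_CALL_SCHEMA)
print(merged["definitions"]["fileType"])
# {'type': 'string', 'enum': ['VCF']}
```

If the real merge replaces whole definitions instead of combining their keywords, the `type` constraint from the base schema would be lost, which is exactly the case a trial should check.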
If this is not possible, we should discuss other mechanisms for adding fileType validation rules to a schema. It is always an option for the user to provide these additional validations separately from the dynamic schema definition, with us programmatically adding those rules to the merged analysis schema.
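A minimal sketch of that fallback, assuming the allowed file types arrive as a plain list alongside the dynamic schema. The function names and the `definitions.fileType` location are hypothetical and chosen for illustration only.

```python
import copy


def add_file_type_rules(merged_schema, allowed_file_types):
    """Return a copy of `merged_schema` whose fileType only accepts the given values."""
    if not allowed_file_types:
        raise ValueError("allowed_file_types must not be empty")
    result = copy.deepcopy(merged_schema)
    definitions = result.setdefault("definitions", {})
    file_type = definitions.setdefault("fileType", {"type": "string"})
    # Inject the separately supplied restriction as an enum.
    file_type["enum"] = sorted(set(allowed_file_types))
    return result


def is_valid_file_type(schema, value):
    """Tiny stand-in for a real validator: checks string type and optional enum."""
    rule = schema["definitions"]["fileType"]
    if not isinstance(value, str):
        return False
    return "enum" not in rule or value in rule["enum"]


# Assumed merged analysis schema with an unrestricted fileType.
merged_schema = {"definitions": {"fileType": {"type": "string"}}}
restricted = add_file_type_rules(merged_schema, ["BAM"])

print(is_valid_file_type(restricted, "BAM"))     # True
print(is_valid_file_type(restricted, "VCF"))     # False
print(is_valid_file_type(merged_schema, "VCF"))  # True
```

Working on a copy keeps the original merged schema untouched, so the same base can be reused for every analysis type.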
Additional context
Once this process is modified, we may also need to make some minor updates to related processes:
[ ] Data migration to update the two legacy analysis types that Song has pre-loaded into the database. The list of valid file types for both sequencing-experiment and variant-calling should be specific to those analysis types, not the full list they have right now.
Note re definition merging: it seems SANBI has run into some issues trying to do that, and they have kindly documented their findings here. Sharing their ticket in case it helps save some research.
Desired solution
fileType is its own definition
The only restriction on fileType is that it is a string
file.fileType references the fileType definition
fileType is a required property of file
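The base-schema layout this issue asks for (fileType as its own string definition, referenced by file.fileType and required on file) could be sketched as follows. The exact placement of `file` under `definitions` and the `resolve_ref` helper are assumptions for illustration, not the actual Song base schema.

```python
# Assumed base schema fragment, written as a Python dict for easy inspection.
BASE_SCHEMA = {
    "definitions": {
        # fileType is its own definition; the only restriction is "string".
        "fileType": {"type": "string"},
        "file": {
            "type": "object",
            # fileType is a required property of file.
            "required": ["fileType"],
            "properties": {
                # file.fileType references the fileType definition.
                "fileType": {"$ref": "#/definitions/fileType"},
            },
        },
    },
}


def resolve_ref(schema, ref):
    """Follow a local reference such as '#/definitions/fileType'.

    Simplified: handles only local pointers without escaped characters.
    """
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are supported: {ref}")
    node = schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


file_schema = BASE_SCHEMA["definitions"]["file"]
file_type_rule = resolve_ref(BASE_SCHEMA, file_schema["properties"]["fileType"]["$ref"])

print(file_type_rule)                         # {'type': 'string'}
print("fileType" in file_schema["required"])  # True
```

Keeping fileType as a separate definition means a dynamic schema only has to touch one place to restrict it for a given analysis type.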