Currently, the workflow only supports single-sample VCF files.
Multi-sample VCF files are frequently produced, e.g., in tumor-normal callsets, and are therefore a common input for the workflow.
The workflow should not split these files into single-sample VCF files and process each one separately, since information from multiple samples is usually beneficial for interpreting the variants. For example, a variant called in the tumor sample may actually be a germline variant if its allele frequency in the normal sample is also high. Hence, all samples should be summarized within the same report.
The following changes would be necessary:
all vembrane processes: add handling of sample-specific values (custom filters, TSV conversion)
allele fraction and depth handling in TSV conversion: must be sample-specific as well
datavzrd: adapt annotation_colinfo.tsv to account for sample-specific information (e.g., allele fraction, but also format column information)
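To illustrate what "sample-specific" allele fraction and depth would mean in practice, here is a minimal sketch in plain Python. It is not how vembrane or the workflow computes these values. It assumes that each sample carries an AD FORMAT field (reference count first, then alternate counts) and optionally a DP field; callers differ here, and some report an allele fraction directly. The function name `per_sample_values` and the output keys are hypothetical.

```python
def per_sample_values(samples, record_line):
    """Return {sample: {"depth": ..., "allele_fraction": ...}} for one VCF record.

    Assumptions (not taken from the workflow): AD holds "ref,alt[,alt...]"
    read counts, DP (if present) holds the depth, and the allele fraction is
    taken for the first alternate allele only.
    """
    fields = record_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # column 9 is FORMAT
    result = {}
    for sample, raw in zip(samples, fields[9:]):
        values = dict(zip(format_keys, raw.split(":")))
        depth = None
        allele_fraction = None

        ad = values.get("AD", ".").split(",")
        if all(count.isdigit() for count in ad):
            counts = [int(count) for count in ad]
            total = sum(counts)
            depth = total  # fallback if no DP field is present
            if len(counts) > 1 and total > 0:
                allele_fraction = counts[1] / total

        dp = values.get("DP", ".")
        if dp.isdigit():
            depth = int(dp)  # prefer the explicit DP value

        result[sample] = {"depth": depth, "allele_fraction": allele_fraction}
    return result


record = "chr1\t100\t.\tA\tT\t50\tPASS\t.\tGT:AD:DP\t0/1:30,10:40\t0/0:38,2:40"
values = per_sample_values(["tumor", "normal"], record)
```

In this made-up record, the tumor sample has a much higher allele fraction than the normal sample, which is the kind of per-sample contrast the report would need to show side by side.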
Ideally, samples should be detected automatically, and some parameters (such as allele fraction and depth) should be added to the report by default for all samples. A warning message in the MultiQC report should indicate the number of detected samples.
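Automatic sample detection could be as simple as reading the `#CHROM` header line, since the VCF format lists sample names in the columns after `FORMAT`. The following is a rough sketch under that assumption; the function names `detect_samples` and `sample_count_message` are hypothetical, and how the message would be passed to MultiQC is left open.

```python
def detect_samples(lines):
    """Return the sample names from the #CHROM header line of a VCF.

    Sample columns start after the eight fixed columns and FORMAT, i.e. at
    index 9. A sites-only VCF (no FORMAT column) yields an empty list.
    """
    for line in lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached the records without seeing a header line
    return []


def sample_count_message(samples):
    """Build a human-readable message about the detected samples."""
    if not samples:
        return "No samples detected in VCF."
    return f"Detected {len(samples)} sample(s) in VCF: {', '.join(samples)}"


header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ttumor\tnormal\n",
]
message = sample_count_message(detect_samples(header))
```

For gzip-compressed input, the same function should work on lines read via `gzip.open(path, "rt")`, since it only needs an iterable of text lines.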