Open: ardydavari opened this issue 2 years ago
Hi Ardy, Thanks for your interest in decifer.
You are correct that vcf_2_decifer.py expects each sample to have its own column in a single VCF file. Currently, both mutect2 and strelka2 support multi-sample (or joint) calling of somatic SNVs and produce VCF files with a separate column per sample.
However, I can look into options to also allow single-sample somatic VCFs, all from the same patient. This would additionally require access to mpileup files generated from the BAMs, in order to get read counts for the reference and alternate alleles at every site where a variant was called in at least one sample.
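To illustrate the idea, here is a minimal Python sketch of that merging step. It is not part of decifer or vcf_2_decifer.py: the function name `merge_read_counts` and both input layouts are hypothetical, and it assumes the single-sample VCFs and mpileup files have already been parsed into plain dictionaries. It takes the union of sites called in any sample, then looks up the reference and alternate read counts for every sample at each of those sites.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical in-memory layouts (assumptions, not decifer's own formats):
Site = Tuple[str, int, str, str]   # (chrom, pos, ref allele, alt allele)
Locus = Tuple[str, int]            # (chrom, pos)
BaseCounts = Dict[str, int]        # per-base read counts, e.g. {"A": 30, "G": 12}


def merge_read_counts(
    calls: Dict[str, Iterable[Site]],
    pileups: Dict[str, Dict[Locus, BaseCounts]],
) -> Dict[Site, Dict[str, Tuple[int, int]]]:
    """Return (ref_count, alt_count) per sample for every site called in any sample."""
    # Union of variant sites across all single-sample call sets.
    all_sites = sorted(set().union(*calls.values()))

    merged: Dict[Site, Dict[str, Tuple[int, int]]] = {}
    for chrom, pos, ref, alt in all_sites:
        per_sample = {}
        for sample, pileup in pileups.items():
            # A locus missing from the pileup is treated as zero coverage.
            counts = pileup.get((chrom, pos), {})
            per_sample[sample] = (counts.get(ref, 0), counts.get(alt, 0))
        merged[(chrom, pos, ref, alt)] = per_sample
    return merged


# Toy example: each sample has one private variant.
calls = {
    "S1": {("chr1", 100, "A", "G")},
    "S2": {("chr1", 200, "C", "T")},
}
pileups = {
    "S1": {("chr1", 100): {"A": 30, "G": 12}, ("chr1", 200): {"C": 40}},
    "S2": {("chr1", 100): {"A": 25}, ("chr1", 200): {"C": 20, "T": 15}},
}
merged = merge_read_counts(calls, pileups)
```

In this toy example, the variant private to S1 still gets reference read counts in S2 (and vice versa), which is the information a per-sample VCF alone does not provide.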
Sincerely, Brian
Hi Brian,
Thank you for your help! I will definitely look into the multi-sample mode for mutect2, as that may be the best option.
I do think it would be nice to have the second option as well (especially for alternate callers such as CaVEMan). Of note, I've seen some other methods tackle this problem by imputing the missing read counts from the geometric mean of reads, although this is less desirable.
Sincerely, Ardy
Hi,
I would like to run decifer on my dataset, but I have some questions about the preprocessing step.
It seems that vcf_2_decifer.py expects each sample to have its own column in a single VCF. Is that right?
Most of the pipelines I have run produce a single-sample VCF for each tumor, called against the matched normal. Is there a recommended way to merge these VCFs so that the reference reads for variants private to one sample are calculated correctly in the other samples?
Thank you