Closed pushpa-itagi closed 7 months ago
Hi @pushpa-itagi,
Could you please confirm whether you are providing VCF files as input? If VCFs are provided and exome=True, the mutations are downsampled to the exome regions of the genome (see the exome parameter in SigProfilerMatrixGenerator's README).
Hi, yes, we used VCF files as input. Ah, I see the matrix generator module explains it. Thanks for the quick reply. Also,
The exome regions are defined in the exome directory for the corresponding reference genome. https://github.com/AlexandrovLab/SigProfilerMatrixGenerator/tree/master/SigProfilerMatrixGenerator/references/chromosomes/exome These regions are from SureSelect v7.
If you want to retain all SNVs, run the matrix generator with BED=None and exome=False. This disables downsampling to the regions specified in SureSelect.
Please reach out if you have any additional questions.
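To make the effect of restricting mutations to exome regions concrete, here is a minimal, standalone Python sketch that counts how many SNVs in a VCF fall inside a set of BED intervals. It is not SigProfilerMatrixGenerator's code and does not call its API; the function names (load_regions, in_regions, count_snvs) are hypothetical, and it assumes a tab-separated BED file with 0-based, half-open intervals and a VCF with 1-based positions.

```python
BASES = {"A", "C", "G", "T"}


def strip_chr(name):
    """Normalise 'chr1' and '1' to the same key (assumption: both styles may occur)."""
    return name[3:] if name.startswith("chr") else name


def load_regions(bed_lines):
    """Parse BED lines into {chrom: [(start, end), ...]} with 0-based, half-open intervals."""
    regions = {}
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.setdefault(strip_chr(chrom), []).append((int(start), int(end)))
    return regions


def in_regions(regions, chrom, pos):
    """Return True if the 1-based VCF position lies inside any interval for chrom."""
    # 1-based pos p is 0-based p - 1, so start <= p - 1 < end becomes start < p <= end.
    return any(start < pos <= end for start, end in regions.get(strip_chr(chrom), []))


def count_snvs(vcf_lines, regions):
    """Return (total SNVs, SNVs inside the regions); indels and multi-allelic rows are skipped."""
    total = kept = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3].upper(), fields[4].upper()
        if ref not in BASES or alt not in BASES:
            continue  # keep single-base substitutions only
        total += 1
        if in_regions(regions, chrom, pos):
            kept += 1
    return total, kept
```

Running count_snvs on the lines of an input VCF together with the regions parsed from an exome BED file gives a rough idea of how many SNVs lie outside the target regions, which is the kind of gap described in this issue. The tool's own filtering may differ in its details.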
Hi,
We are using the SigProfilerAssignment tool (COSMIC v3.2) to fit signatures for WES samples. When we set exome=True, the final Activities.txt file appears to contain fewer mutations than the input VCF file. For instance, if the input VCF file had 500 SNVs, the final Activities.txt file shows fewer than 500 mutations. We are not sure what causes these mutations to be dropped. Is it possible that exome=True removes certain mutations, or that the renormalization cannot assign some of them? Please let me know if there is a parameter or something else that needs to be changed.
Thanks,
Pushpa Itagi
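The comparison described above (SNVs in the input VCF versus mutations in Activities.txt) can be sketched as follows. This is only an illustration: it assumes the activities file is tab-separated, with a header row, the sample name in the first column, and one column of assigned mutation counts per signature. The function name activity_totals is hypothetical.

```python
import csv
import io


def activity_totals(activities_text):
    """Sum the per-signature counts in each row, returning {sample: total assigned mutations}."""
    reader = csv.reader(io.StringIO(activities_text), delimiter="\t")
    next(reader)  # skip the header row (sample column name plus signature names)
    totals = {}
    for row in reader:
        if not row:
            continue
        totals[row[0]] = sum(float(value) for value in row[1:])
    return totals
```

Reading the file with open(path).read() and passing the text to activity_totals gives one total per sample, which can then be set against the number of SNV records in that sample's VCF.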