Closed jamesdalg closed 8 months ago
Hi, we currently discard the read names after feature generation, but it is certainly possible to capture this information at runtime. Another option is to write a small utility function, run afterwards, that retrieves the read names for the SNPs you are interested in.
Hi, I have added a script: https://github.com/WGLab/NanoCaller/blob/master/misc/get_SNP_readnames.py
You can run it as: python get_SNP_readnames.py --vcf variants.vcf.gz --bam alignments.bam --output read_names
The format of the output is: chromosome position allele1:read_name1,read_name2 allele2:read_name3,read_name4 allele3:read_name5
Where allele 1 is the reference allele, followed by all the alternative alleles.
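For anyone who wants to post-process that output, here is a minimal parsing sketch. It assumes the fields on each line are whitespace-separated and that each allele field has the form allele:name1,name2 as described above; I have not checked this against the script itself, and the function name parse_readnames_line and the example line are made up for illustration.

```python
from typing import Dict, List, Tuple


def parse_readnames_line(line: str) -> Tuple[str, int, Dict[str, List[str]]]:
    """Parse one output line into (chromosome, position, {allele: [read names]}).

    Assumption: fields are separated by whitespace and each allele field
    looks like "ALLELE:read1,read2". This helper is hypothetical and is
    not part of NanoCaller.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least chromosome and position: {line!r}")

    chrom, pos = fields[0], int(fields[1])

    reads_by_allele: Dict[str, List[str]] = {}
    for field in fields[2:]:
        # Split on the first colon only, in case a read name contains one.
        allele, _, names = field.partition(":")
        # Drop empty entries so an allele with no reads maps to an empty list.
        reads_by_allele[allele] = [name for name in names.split(",") if name]

    return chrom, pos, reads_by_allele


# Made-up example line: the first allele listed is the reference allele.
chrom, pos, reads = parse_readnames_line("chr1 12345 A:read1,read2 G:read3")
```

Since the reference allele is listed first and Python dicts preserve insertion order, the first key of the returned dict is the reference allele.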
wow! Thanks! I'll give it a try.
Is there a way to find which supporting reads contain the SNP in question using NanoCaller? Perhaps some temporary files created during SNP calling could help determine this.