Illumina / Pisces

Somatic and germline variant caller for amplicon data. Recommended caller for tumor-only workflows.
GNU General Public License v3.0

Hi, how to deal with multiple variants in a small region #50

Open ShannonDaddy opened 4 years ago

ShannonDaddy commented 4 years ago

Hi, how should I deal with multiple variants in a small region? Is there a way to combine them into one variant, using Pisces or other tools? The attached file shows what we see for EGFR exon 19 deletions: we get three separate variants with exactly the same variant frequencies, but they should actually come from the same deletion.

tamsen commented 4 years ago

Hi. Yes, you can use Scylla for that. It uses the Pisces vcf + bam to mine for evidence that the variants are always in phase.

https://github.com/Illumina/Pisces/wiki/Scylla-5.2.10-Design-Document
https://github.com/Illumina/Pisces/wiki/Suggested-Pipeline-Configuration-5.2.10

ShannonDaddy commented 4 years ago

Thanks, I'll try Scylla out.

ShannonDaddy commented 4 years ago

Hi, after running Scylla I got the result shown in the attached image, but some bases are represented by 'R' instead of the actual bases, and the INFO fields from the original vcf are lost. Are there options to show the actual bases and keep the INFO fields? The original and phased vcf files are attached. @tamsen

Thanks a lot!

T438267.zip

tamsen commented 4 years ago

Hi - yes, if you don't give Scylla a genome file, it just uses "R" for the unknown bases between phased variants. Give it -g {path/to/genome} or similar to get rid of the Rs. If you run Scylla with -h or no arguments, it will tell you all the command options. I don't remember them all off the top of my head, but hopefully you will find what you are looking for.

ShannonDaddy commented 4 years ago

Thanks a lot! The unknown bases represented by 'R's were fixed by giving the genome file, but I couldn't find any option to keep the INFO fields from the original vcf file, so I wrote a script to extract the INFO fields from the input vcf and add them to the phased vcf.
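
For anyone hitting the same problem, here is a minimal sketch of what such a script could look like. It is not the script mentioned above and not part of Pisces or Scylla. It assumes plain-text, tab-separated VCF lines, and it assumes that a phased record should inherit the INFO of every original record whose REF span overlaps it, which may not be the right rule for every dataset. The function names (`merge_info`, `restore_info`) are made up for illustration.

```python
def merge_info(info_strings):
    """Merge several INFO strings key by key, keeping distinct values."""
    merged = {}
    for info in info_strings:
        if info in (".", ""):
            continue
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            values = merged.setdefault(key, [])
            # Flags have no "=", so they keep an empty value list.
            if sep and value not in values:
                values.append(value)
    if not merged:
        return "."
    return ";".join(k if not v else f"{k}={','.join(v)}" for k, v in merged.items())


def index_info(original_lines):
    """Collect (pos, ref_length, info) per chromosome from the original VCF."""
    by_chrom = {}
    for line in original_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, info = cols[0], int(cols[1]), cols[3], cols[7]
        by_chrom.setdefault(chrom, []).append((pos, len(ref), info))
    return by_chrom


def restore_info(original_lines, phased_lines):
    """Return phased VCF lines with INFO filled in from overlapping originals.

    Assumption (not confirmed by the thread): overlap of REF spans is a
    good enough way to link a phased record to its source records.
    """
    index = index_info(original_lines)
    out = []
    for line in phased_lines:
        if line.startswith("#") or not line.strip():
            out.append(line.rstrip("\n"))
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, start = cols[0], int(cols[1])
        end = start + len(cols[3]) - 1
        overlapping = [
            info
            for pos, ref_len, info in index.get(chrom, [])
            if pos <= end and pos + ref_len - 1 >= start
        ]
        # Keep whatever INFO the phased record already has, then add the rest.
        cols[7] = merge_info([cols[7]] + overlapping)
        out.append("\t".join(cols))
    return out
```

To use it, read both files with `open(path).readlines()`, pass the two lists to `restore_info`, and write the returned lines to a new file. Keys whose values differ between the merged source records end up comma-joined (for example `DP=500,480`), so check that this is what downstream tools expect before relying on it.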