secastel / phaser

phasing and Allele Specific Expression from RNA-seq
GNU General Public License v3.0

Prior population phasing with or without WGS? #50

Open snaqvi1990 opened 5 years ago

snaqvi1990 commented 5 years ago

Hi Stephane,

I have WGS and RNA-seq from the same donor (a cell line), and I'd like to get haplotypes that are as long as possible. The docs say that population-based phasing prior to phaser helps a lot. What about population-based phasing that also uses sequencing reads (WGS only), as SHAPEIT2 can do, prior to phaser? I'm not sure whether you've tested this or have a sense of whether it would be better to include WGS at both steps or only in the phaser step.

Thanks, Sahin

secastel commented 4 years ago

Hi Sahin, this message totally slipped through the cracks, and I apologize for the absurdly long delay in responding. In case this is still of any use to you: if you can spare the compute, it would be useful to include the WGS reads while doing the population phasing, using e.g. SHAPEIT2. One issue we've experienced with SHAPEIT2 is that it is extremely compute-intensive and slow when doing this, especially with 30x WGS, so it may not be viable depending on how many samples you have. If you can't run SHAPEIT2 with the WGS reads, you can instead use phaser to correct the SHAPEIT2-based population phasing with the WGS reads, and it will run much faster.

Best, Stephane

snaqvi1990 commented 4 years ago

Hi Stephane,

No problem. I actually have an updated question about this use case. I now have a 10X-phased genome for this donor (i.e. almost perfectly phased across megabase-scale regions), so I want to use phaser just to get haplotypic counts for each gene. My question is: if I input the fully phased VCF along with the RNA-seq, will phaser change or add to the existing phasing? I would imagine not, but I wanted to check. Alternatively, is there a way to run just the haplotypic count part of phaser?

Thanks, Sahin
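
For readers unfamiliar with the term, the idea of "haplotypic counts for each gene" from an already phased VCF can be illustrated with a minimal Python sketch. This is not phaser's implementation and does not use its API. The function name `haplotypic_counts`, the record keys (`gene`, `gt`, `ref_count`, `alt_count`), and the example values are all hypothetical, and the sketch assumes biallelic sites with genotypes written in the VCF phased form (`0|1` or `1|0`).

```python
from collections import defaultdict


def haplotypic_counts(variants):
    """Sum allele-specific RNA-seq read counts per gene and haplotype.

    Each variant is a dict with hypothetical keys:
      gene       gene identifier the variant falls in
      gt         VCF-style genotype, e.g. "0|1" (phased) or "0/1" (unphased)
      ref_count  reads supporting the reference allele
      alt_count  reads supporting the alternate allele
    Returns {gene: (haplotype_A_count, haplotype_B_count)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for v in variants:
        if "|" not in v["gt"]:
            continue  # unphased site: cannot be assigned to a haplotype
        left, right = v["gt"].split("|")
        if left == right or {left, right} != {"0", "1"}:
            continue  # homozygous or non-biallelic site: skipped in this sketch
        by_allele = {"0": v["ref_count"], "1": v["alt_count"]}
        # The allele left of "|" sits on haplotype A, the right one on haplotype B.
        totals[v["gene"]][0] += by_allele[left]
        totals[v["gene"]][1] += by_allele[right]
    return {gene: tuple(counts) for gene, counts in totals.items()}


# Made-up example data, for illustration only.
example = [
    {"gene": "GENE1", "gt": "0|1", "ref_count": 10, "alt_count": 4},
    {"gene": "GENE1", "gt": "1|0", "ref_count": 3, "alt_count": 12},
    {"gene": "GENE2", "gt": "1|1", "ref_count": 0, "alt_count": 20},
    {"gene": "GENE2", "gt": "0/1", "ref_count": 5, "alt_count": 5},
]
result = haplotypic_counts(example)
print(result)  # {'GENE1': (22, 7)}
```

Note that naively summing per-variant counts like this can count the same read twice when it overlaps two heterozygous sites in one gene; the sketch ignores that and is only meant to show how the phase in the VCF determines which haplotype each allele's reads are credited to.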