DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

Phased SNP information for allele specific RNA-seq analysis? #256

Open zhang-jiankun opened 4 years ago

zhang-jiankun commented 4 years ago

Hi,

I'm trying to resolve different alleles in single-cell RNA-seq data. I see that if a read overlaps a SNP, this is marked in the SAM file. Can we incorporate phased SNP information for allele-specific gene expression analysis using hisat2?

This seems feasible to me if we build the genome index using a SNP file with lines like those below. That is, each heterozygous locus is represented by two lines, one for the paternal allele and one for the maternal allele. However, I'm not sure whether this will cause unexpected problems. I would appreciate any suggestions.

pat_1 single chr1 740738 T
mat_1 single chr1 740738 C
pat_2 single chr1 770502 A
mat_2 single chr1 770502 G
pat_3 single chr1 792149 A
mat_3 single chr1 792149 G
pat_4 single chr1 797392 G
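
For reference, here is a minimal sketch of how the SNP lines above could be generated from a list of phased heterozygous sites. The function name `phased_snp_lines` and the input tuple layout are hypothetical, not part of hisat2. I'm assuming tab-separated columns (ID, type, chromosome, position, base), and positions are copied through as given with no coordinate conversion. Only the first three sites are used, because both alleles are listed for them.

```python
from typing import Iterable, List, Tuple

# Hypothetical input record: (chromosome, position, paternal base, maternal base).
# Positions are written out unchanged; no 0-based/1-based conversion is done here.
PhasedSite = Tuple[str, int, str, str]


def phased_snp_lines(sites: Iterable[PhasedSite]) -> List[str]:
    """Emit two SNP lines per heterozygous site: one paternal, one maternal."""
    lines = []
    for i, (chrom, pos, pat_base, mat_base) in enumerate(sites, start=1):
        # Assumed column order: ID, type, chromosome, position, base (tab-separated).
        lines.append(f"pat_{i}\tsingle\t{chrom}\t{pos}\t{pat_base}")
        lines.append(f"mat_{i}\tsingle\t{chrom}\t{pos}\t{mat_base}")
    return lines


# The three sites from the example for which both alleles are given.
sites = [
    ("chr1", 740738, "T", "C"),
    ("chr1", 770502, "A", "G"),
    ("chr1", 792149, "A", "G"),
]

snp_lines = phased_snp_lines(sites)
print("\n".join(snp_lines))
```

Note that this only writes the two-lines-per-locus layout I described; whether hisat2 handles an allele line that matches the reference base is exactly the part I'm unsure about.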

Thanks!