DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

Question about different behaviour between Tophat2 and Hisat2 #271

Open X-Mialhe opened 3 years ago

X-Mialhe commented 3 years ago

I'm a bioinformatician at a sequencing core facility in Montpellier (France), and I have a few questions about the TopHat2 and HISAT2 algorithms.

First, a question about uniquely aligned reads. When a read aligns to a single position in the transcriptome (according to the GTF annotation) but also to another position in the genome, TopHat2 considered it uniquely aligned, whereas HISAT2 does not. As a consequence, gaps appear in the transcript coverage, especially in the middle of exons with low mappability (see attached figure).
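To make the distinction concrete, here is a minimal sketch of how uniquely and multiply aligned records can be separated in a SAM file. It assumes the aligner writes the `NH:i` tag (number of reported alignments for the read), which both TopHat2 and HISAT2 normally do. The function names and the toy records are invented for illustration only and are not real data.

```python
from collections import Counter

def hit_count(sam_line):
    """Return the NH:i value of a SAM record, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:          # optional tags start at column 12
        if tag.startswith("NH:i:"):
            return int(tag[5:])
    return None

def classify(sam_lines):
    """Count primary records as unique, multi, unmapped, or lacking NH."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):     # skip header lines
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:               # read is unmapped
            counts["unmapped"] += 1
            continue
        if flag & 0x100:             # secondary alignment: count each read once
            continue
        nh = hit_count(line)
        if nh is None:
            counts["no_NH_tag"] += 1
        elif nh == 1:
            counts["unique"] += 1
        else:
            counts["multi"] += 1
    return counts

# Hypothetical toy records (not taken from a real run).
EXAMPLE = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tchr1\t200\t1\t50M\t*\t0\t0\t*\t*\tNH:i:2",
    "r2\t256\tchr6\t300\t1\t50M\t*\t0\t0\t*\t*\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*\tYT:Z:UU",
]

result = classify(EXAMPLE)
print(dict(result))
```

In this toy input, `r2` stands for the kind of read described above: it has one alignment inside the gene and a second one elsewhere in the genome, so it is reported with `NH:i:2` and drops out of any "unique only" coverage track. For paired-end data each mate is a separate record, so the counts are per record rather than per fragment.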

Second, I tried the --avoid-pseudogene option, as I thought it might address the issue above. It does improve exon coverage, but it is not documented. Can you explain how this option works, and why it is labelled "experimental" on the HISAT2 website?

We would really like to update our pipeline and switch from TopHat2 to HISAT2, but we need to be sure about this option and about the exact behaviour of HISAT2 in this situation.

Regards,

Xavier Mialhe

[Attached figure: Coverage_gene_HMGN2]