DimmestP / chimera-quantseq

Apache License 2.0

HiSat2 20bp minimum read length #4

Closed DimmestP closed 3 years ago

DimmestP commented 3 years ago

HiSat2 gives an error if any reads shorter than 20bp are present. I therefore filter the reads after trimming to remove any shorter than 20bp. Could this have ramifications for the analysis?

Apparently it is a bug? Should HiSat2 be updated?

https://github.com/DaehwanKimLab/hisat2/issues/245

DimmestP commented 3 years ago

Apparently Edward is aware of this problem? https://github.com/riboviz/riboviz/issues/188

DimmestP commented 3 years ago

Current bifx HiSat2 version is 2.1

ewallace commented 3 years ago

Two points here. First and less importantly, there are some other versions of hisat2 on bifx, but you have to access them using biocontainers (or maybe conda) and I don't remember how. I forwarded Sam the email conversation I had with the bifx admins.

Second and importantly, a 20nt minimum length shouldn't be a problem. This is 75bp paired-end read data, so any read that is down to 20nt after trimming has already lost most of its sequence. I believe the MultiQC report said that most reads were far longer. So filtering reads is fine, and the hisat2 version need not block us.
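One way to check how much data a 20nt cutoff would actually remove is to count short reads directly in the trimmed FASTQ. The following is a minimal Python sketch, not part of the pipeline: the function names `read_lengths` and `fraction_shorter_than` are hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
from itertools import islice


def read_lengths(lines):
    """Yield the sequence length of each 4-line FASTQ record."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            break
        yield len(record[1].strip())  # line 2 of each record is the sequence


def fraction_shorter_than(lines, min_len=20):
    """Return the fraction of reads whose sequence is shorter than min_len."""
    lengths = list(read_lengths(lines))
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < min_len) / len(lengths)


# Toy input: four reads of length 40, 8, 20 and 75.
example = [
    "@r1", "ACGT" * 10, "+", "I" * 40,
    "@r2", "ACGTACGT", "+", "I" * 8,
    "@r3", "A" * 20, "+", "I" * 20,
    "@r4", "A" * 75, "+", "I" * 75,
]
print(fraction_shorter_than(example))  # 0.25 (only r2 is below 20nt)
```

For a real file, the same function can be passed an open file handle, for example one returned by `gzip.open(path, "rt")`. If the reported fraction is small, the filter is removing very little data.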

DimmestP commented 3 years ago

I have stuck with the current version of HISAT2 and just removed any reads shorter than 20bp (the quantseq analysis pipeline recommends you do this anyway).
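For paired-end data, a length filter has to keep the two FASTQ files in sync. The sketch below is a Python illustration of one way to do that, not the code actually used here: `fastq_records` and `filter_pairs` are hypothetical names, and dropping the whole pair when either mate is shorter than 20bp is an assumption, since the thread does not say how pairs were handled.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (header, sequence, separator, quality) tuples from 4-line FASTQ."""
    it = iter(lines)
    while True:
        rec = [line.rstrip("\n") for line in islice(it, 4)]
        if len(rec) < 4:  # end of input
            return
        yield tuple(rec)


def filter_pairs(r1_lines, r2_lines, min_len=20):
    """Yield only read pairs where BOTH mates are at least min_len long.

    Assumption: a pair is discarded if either mate is too short, so the
    R1 and R2 outputs stay in the same order with the same read count.
    """
    for rec1, rec2 in zip(fastq_records(r1_lines), fastq_records(r2_lines)):
        if len(rec1[1]) >= min_len and len(rec2[1]) >= min_len:
            yield rec1, rec2


# Toy input: pair "a" passes, pair "b" has a 5bp second mate and is dropped.
r1 = ["@a/1", "A" * 30, "+", "I" * 30,
      "@b/1", "C" * 40, "+", "I" * 40]
r2 = ["@a/2", "G" * 25, "+", "I" * 25,
      "@b/2", "T" * 5, "+", "I" * 5]

kept = list(filter_pairs(r1, r2))
print([rec1[0] for rec1, _ in kept])  # ['@a/1']
```

Writing each kept `rec1` and `rec2` back out to separate files gives two FASTQ files with matching read order.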