HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

HiSeq reads length #175

Closed: gregoruar closed 3 years ago

gregoruar commented 4 years ago

Hi Hadrien! While running an experiment, I found that with the HiSeq error model the reads are 126 bp long, as opposed to the 125 bp stated in the tutorial. Could you clarify this?

HadrienG commented 4 years ago

Hi!

This is true for all error models in InSilicoSeq (e.g. MiSeq will produce 301 bp reads). Typically on an Illumina instrument you'll do n+1 cycles for an n bp run. The n+1 base can be considered an artefact of Illumina sequencing; some sequencing centres drop the extra base at the demultiplexing step, some don't, and you end up with n+1 bp reads.

I'd advise always removing it at QC, since it usually has a terrible error rate (even if the basecalling tells you otherwise).
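
As a rough illustration of that QC step, here is a minimal Python sketch that truncates every read to the nominal length (e.g. 125 for HiSeq, 300 for MiSeq). The function name `trim_extra_base` and its parameters are made up for this example and are not part of InSilicoSeq; it also assumes plain 4-line FASTQ records with no line wrapping.

```python
from typing import Iterable, Iterator, List


def trim_extra_base(lines: Iterable[str], target_length: int) -> Iterator[str]:
    """Yield FASTQ lines with sequence and quality cut to target_length.

    Hypothetical helper: assumes each record is exactly 4 lines
    (header, sequence, '+', quality). Reads already at or below
    target_length pass through unchanged.
    """
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            yield header
            yield seq[:target_length]   # drop the n+1 base (and anything beyond)
            yield plus
            yield qual[:target_length]  # keep quality in sync with the sequence
            record = []


if __name__ == "__main__":
    # Toy record: a 6-base read trimmed to a nominal length of 5.
    toy = ["@read1\n", "ACGTAC\n", "+\n", "IIIII#\n"]
    for out_line in trim_extra_base(toy, target_length=5):
        print(out_line)
```

The same function can be fed an open file handle, since it only iterates over lines. For real data, an established read-trimming tool is probably the safer choice; the sketch is only meant to show what "removing the extra base" amounts to.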