lh3 / seqtk

Toolkit for processing sequences in FASTA/Q formats
MIT License

How to interlace paired-end reads via seqtk and make the headers end in /1 and /2 corresponding to their read orientation (R1 and R2, respectively)? #56

Open yoyohashao opened 9 years ago

yoyohashao commented 9 years ago

Hello, sorry for the empty previous post; I hit the Enter key too quickly. I am preparing datasets for kmernorm, which requires files that look like this:

    @seq_1/1
    AAAAAAAACCCCCCCTTTTTTTTTGGGGGGGG
    +
    &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
    @seq_1/2
    AAAAAAAACCCCCCCTTTTTTTTTGGGGGGGG
    +
    &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
    @seq_2/1
    AAAAAAAACCCCCCCTTTTTTTTTGGGGGGGG
    +
    &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
    @seq_2/2
    AAAAAAAACCCCCCCTTTTTTTTTGGGGGGGG
    +
    &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&

A bioinformatician told me seqtk could achieve this, but after looking through the seqtk examples I couldn't find the command. Could someone tell me how to do this in seqtk? Many thanks!

tseemann commented 8 years ago

To interleave PE reads do this:

seqtk mergepe R1.fq.gz R2.fq.gz | gzip > Interleaved.fq.gz
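The command above only covers the interleaving half of the question. If the merged headers do not already end in /1 and /2, one possible way to add them is a small awk filter on the interleaved stream. This is a rough sketch, not a seqtk feature: the function name add_pair_suffix is made up for illustration, and it assumes every record is exactly four lines, that R1 and R2 records strictly alternate, and that dropping anything after the first whitespace in the header (e.g. an Illumina-style comment) is acceptable.

```shell
# Hypothetical helper (not part of seqtk): reads an interleaved 4-line FASTQ
# stream on stdin and writes it to stdout with /1 and /2 appended to headers.
# Assumes strict R1/R2 alternation and exactly four lines per record.
add_pair_suffix() {
  awk '
    # Line 1 of every 8 is the R1 header: drop any comment, append /1.
    NR % 8 == 1 { sub(/[ \t].*$/, ""); print $0 "/1"; next }
    # Line 5 of every 8 is the R2 header: drop any comment, append /2.
    NR % 8 == 5 { sub(/[ \t].*$/, ""); print $0 "/2"; next }
    # Sequence, "+" and quality lines pass through unchanged.
    { print }
  '
}

# Intended use, assuming seqtk is installed and the input files exist:
#   seqtk mergepe R1.fq.gz R2.fq.gz | add_pair_suffix | gzip > Interleaved.fq.gz
```

Headers are identified by line position rather than by a leading @, because a quality line can also start with @. If your headers already end in /1 and /2, skip this step, otherwise the suffix would be added twice.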