bioinformatics-centre / kaiju

Fast taxonomic classification of metagenomic sequencing reads using a protein reference database
http://kaiju.binf.ku.dk
GNU General Public License v3.0

paired and unpaired reads #9

Closed: y-mone closed this issue 8 years ago

y-mone commented 8 years ago

Hi, I want to analyse a set of reads with Kaiju. After the trimming step of my paired-end data, some of the reads lost their mates, so I now have a fastq file with the right mates, a fastq file with the left mates and a fastq file with singletons. Could you please tell me whether it is possible to include all the reads (paired and singleton) in the Kaiju analysis?

Thank you in advance for your reply,

pmenzel commented 8 years ago

Hi, I assume that your file with right mates and your file with left mates contain the same number of reads, and that the reads are in the same order in both files.
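If you want to verify that assumption before running Kaiju, a minimal sketch along these lines may help. It is not part of Kaiju: the function names count_reads, read_ids and check_pairs are hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per record and read names that differ between mates only by a trailing /1 or /2 or by a comment after the first space.

```shell
# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Print one read ID per record: take the header line (lines 1, 5, 9, ...),
# keep the first whitespace-delimited field and drop a trailing /1 or /2.
read_ids() {
    awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }' "$1"
}

# Return 0 only if both files hold the same number of reads
# and the read IDs appear in the same order.
check_pairs() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ] || return 1
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 2
    read_ids "$1" > "$tmp1"
    read_ids "$2" > "$tmp2"
    if cmp -s "$tmp1" "$tmp2"; then status=0; else status=1; fi
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

With these functions defined, check_pairs leftmates.fastq rightmates.fastq should return success when the two files are consistent and a non-zero status otherwise.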

Then you can just run Kaiju twice and combine the output of both runs.

First, run Kaiju with the intact read pairs in paired-end mode:

kaiju -i leftmates.fastq -j rightmates.fastq -o pairedreads.out ...

Second, run the singletons:

kaiju -i singletons.fastq -o singletons.out ...

Then just concatenate the two output files:

cat pairedreads.out singletons.out > combined.out

Now you can run kaijuReport, kaiju2krona, etc. on the combined.out file.
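As a quick sanity check on the concatenated file, the sketch below counts classified and unclassified entries. It assumes Kaiju's default tab-separated output, in which the first column is C (classified) or U (unclassified); the function name summarise_kaiju is hypothetical and not part of Kaiju.

```shell
# Summarise a Kaiju output file: count lines whose first
# tab-separated column is C (classified) or U (unclassified).
summarise_kaiju() {
    awk -F '\t' '
        $1 == "C" { c++ }
        $1 == "U" { u++ }
        END { printf "classified\t%d\nunclassified\t%d\ntotal\t%d\n", c, u, c + u }
    ' "$1"
}
```

Running summarise_kaiju combined.out should report a total equal to the number of read pairs plus the number of singletons, since a paired-end run produces one output line per pair.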