Open tayabsoomro opened 6 years ago
What kind of outputs are you trying to combine? Are they from a single sample, and do you want a report for the sum of the two? Or are you trying to compare them?
So, I have two kraken-style output files generated by two DNA classification runs. These outputs show the proportion of reads assigned to each taxon in the sample. An example is shown just below:
6.00 600 600 U 0 unclassified
94.00 9400 0 - 1 root
94.00 9400 0 - 131567 cellular organisms
94.00 9400 0 D 2 Bacteria
94.00 9400 0 - 1783272 Terrabacteria group
94.00 9400 0 P 1239 Firmicutes
94.00 9400 0 C 91061 Bacilli
94.00 9400 0 O 1385 Bacillales
94.00 9400 0 F 186817 Bacillaceae
94.00 9400 0 G 1386 Bacillus
94.00 9400 0 S 86661 Bacillus cereus group
94.00 9400 3463 S 1392 Bacillus anthracis
58.71 5871 5871 S 198094200 B.anthracis Ames
0.66 66 66 S 191218100 B.anthracis A2012
Now, imagine that the second kraken-style report contains some of the same species and some different ones. I would like to generate a final kraken-style report that merges the data from the two previous ones.
So, for example, if B. anthracis Ames is also in the 2nd kraken-style report, the final report would show it only once, with the proportions increased. But if the 2nd report contains another strain under Bacillus anthracis that is not present in the 1st report, the final report would add it under Bacillus anthracis and update the proportions accordingly.
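The merge described above can be sketched in Python. This is only an illustration under stated assumptions, not an existing Kraken feature: it assumes each report is tab-separated with the six columns shown in the example (percentage, clade reads, direct reads, rank code, taxid, name) and that the name column is indented two spaces per taxonomy level, as in unmodified kraken-style reports. The function names `parse_report` and `merge_reports` are hypothetical. Direct read counts are summed per taxid, clade counts and percentages are recomputed from the merged tree, and a taxon seen only in a later report is attached under the parent it had in that report.

```python
def parse_report(lines):
    """Yield (depth, rank, taxid, name, direct_reads) for each report line.

    Assumes six tab-separated columns and a name indented two spaces
    per taxonomy level; lines with a different column count are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue
        _pct, _clade, direct, rank, taxid, name = fields
        stripped = name.lstrip(" ")
        depth = (len(name) - len(stripped)) // 2
        yield depth, rank.strip(), int(taxid), stripped, int(direct)


def merge_reports(*reports):
    """Merge kraken-style reports (iterables of lines) into one list of lines."""
    nodes = {}      # taxid -> {"rank", "name", "direct", "clade"}
    children = {}   # parent taxid (None for top level) -> child taxids
    for lines in reports:
        stack = []  # stack[d] is the most recent taxid seen at depth d
        for depth, rank, taxid, name, direct in parse_report(lines):
            del stack[depth:]
            parent = stack[-1] if stack else None
            stack.append(taxid)
            if taxid not in nodes:
                nodes[taxid] = {"rank": rank, "name": name, "direct": 0}
                children.setdefault(parent, []).append(taxid)
            nodes[taxid]["direct"] += direct

    def clade_total(taxid):
        # Clade reads = reads assigned directly plus reads in all descendants.
        total = nodes[taxid]["direct"]
        total += sum(clade_total(c) for c in children.get(taxid, []))
        nodes[taxid]["clade"] = total
        return total

    grand_total = sum(clade_total(t) for t in children.get(None, []))

    merged = []

    def emit(taxid, depth):
        node = nodes[taxid]
        pct = 100.0 * node["clade"] / grand_total if grand_total else 0.0
        merged.append("%6.2f\t%d\t%d\t%s\t%d\t%s%s" % (
            pct, node["clade"], node["direct"], node["rank"],
            taxid, "  " * depth, node["name"]))
        for child in children.get(taxid, []):
            emit(child, depth + 1)

    for top in children.get(None, []):
        emit(top, 0)
    return merged
```

With two report files open as `a` and `b`, `merge_reports(a, b)` returns the merged lines, ready to be written out with `"\n".join(...)`. Children are kept in first-seen order rather than being re-sorted by read count, so the line order may differ from what Kraken itself would print.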
@tayabsoomro I know this is very late to say, but we are working on a set of "Kraken-Tools" that will provide additional support for projects such as this.
That is good to hear! I ended up creating such a tool myself, but it would be great if it were added to Kraken. Thanks.
Hey @tayabsoomro. I'm interested in doing something similar, so could you please share this tool you're referring to?
Thank you!
I ended up using the Centrifuge tool and its command centrifuge-kreport to generate the kraken-style report.
So I combined the multiple Centrifuge reports by appending the files in Python, and once I had the accumulated Centrifuge report file, I generated the kraken-style report from it.
Here is the snippet of code that I created, hope it helps:
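For readers without the original snippet, a minimal sketch of that approach might look like the following. It is not the poster's code. It assumes the files being appended are Centrifuge's tab-separated per-read classification outputs, each starting with a single header line, so only the first header is kept. The function name `concat_centrifuge_outputs` and the file names in the usage note are hypothetical.

```python
def concat_centrifuge_outputs(paths, combined_path):
    """Append several Centrifuge output files into one combined file.

    Assumes every input starts with one header line; the header of the
    first file is kept and the headers of the remaining files are dropped.
    Returns the number of data lines written.
    """
    records = 0
    with open(combined_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as src:
                header = src.readline()
                if index == 0 and header:
                    out.write(header if header.endswith("\n") else header + "\n")
                for line in src:
                    if not line.strip():
                        continue  # skip blank lines
                    # Guard against a missing final newline so that lines
                    # from consecutive files are never glued together.
                    out.write(line if line.endswith("\n") else line + "\n")
                    records += 1
    return records
```

After something like `concat_centrifuge_outputs(["run1.tsv", "run2.tsv"], "combined.tsv")`, the combined file could be passed to centrifuge-kreport, for example `centrifuge-kreport -x <index> combined.tsv > combined.kreport`, where `<index>` stands for the Centrifuge index used for classification.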
Hi, I am wondering if there is any functionality to join two kraken output files together?
Thanks, Tayab Soomro.