jianshu93 closed this 3 months ago
I mentioned in the README that Kun_peng was developed and optimized based on Kraken2, with a focus on reduced memory usage and improved performance. Therefore, the high consistency in results is understandable. The observed differences stem from what I believe are bugs in certain processes of Kraken2, which I have addressed. There's no other reason behind it.
The motivation for this project came from a friend who asked me to run Kraken2 on some data, but my computer's configuration wasn't sufficient. I also submitted an issue to the Kraken2 team, but it seemed to go unanswered. So, in a moment of impulse, I decided to develop my own tool.
Hi @eric9n, how should we describe the differences when preparing a manuscript/README? Alternatively, could you please point me to the issue? 1 to 2 sentences should be enough.
Thanks,
Jianshu
Please refer to our paper for more details. We are currently preparing it.
I look forward to it!
Jianshu
Hi @eric9n,
I was able to produce Krona plots for results from kun_peng and Kraken2 on 2 real-world datasets:
PacBio long metagenomic reads from human gut samples and Illumina shotgun metagenomic reads from an oxygen minimum zone in the ocean (a significantly less studied system). I attached the HTML reports (unzip and open each file in your browser). Overall, Kun_peng is highly consistent with Kraken2. I also observed small differences for some species (e.g., gDora in the min17 example, gCandidatus Pelagibacter in the OMZ_S138 example), with Kraken2 being higher in all those cases. What could be the reason behind this? Let me know if you want to add those results to the README as evidence that it is highly consistent with Kraken2.
Thanks,
Jianshu
banchmark_kun_peng.zip