KThorellGroup / BACTpipe

BACTpipe: An assembly and annotation pipeline for bacterial genomics
https://bactpipe.readthedocs.org
MIT License

Replace SendSketch with Kraken2 #162

Closed boulund closed 3 years ago

boulund commented 3 years ago

@abhi18av @emilio-r @thorellk I spoke to Kaisa earlier today about our frustrations with SendSketch being so unreliable in the pipeline, so we decided we should skip it entirely. It brings more issues than it solves at the moment.

Looking at what other bacterial whole-genome assembly pipelines use for taxonomic classification, Kraken2 seems to be the most common. Reviewing our options, I feel we have the following choices:

Summarizing the pros and cons I can see off the top of my head:

@thorellk and I now think that the best course of action ahead would be to replace SendSketch with Kraken2 and make the database a user-specified parameter at runtime. If the user does not specify a Kraken2 database, that step is skipped and the prokka step proceeds without genus information.
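The optional-database behaviour proposed above could look roughly like the following. This is only a minimal Python sketch of the control flow, not the actual BACTpipe (Nextflow) implementation. The function names, output file names, and the choice of "genus with the most reads" are assumptions made for illustration, and the parser assumes Kraken2's standard six-column tab-separated report.

```python
from typing import List, Optional


def kraken2_command(reads_1: str, reads_2: str, db: Optional[str]) -> Optional[List[str]]:
    """Build a Kraken2 command line, or return None when no database is given."""
    if not db:
        return None  # no database specified: skip classification entirely
    return [
        "kraken2",
        "--db", db,
        "--paired", reads_1, reads_2,
        "--report", "kraken2.report",  # hypothetical output file names
        "--output", "kraken2.out",
    ]


def top_genus(report_text: str) -> Optional[str]:
    """Pick the genus-rank (G) entry with the most reads in its clade.

    Assumes the standard report columns: percentage, clade reads,
    direct reads, rank code, taxid, indented name.
    """
    best_name, best_count = None, 0
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6 or fields[3] != "G":
            continue
        count = int(fields[1])
        if count > best_count:
            best_name, best_count = fields[5].strip(), count
    return best_name


def prokka_command(contigs: str, genus: Optional[str] = None) -> List[str]:
    """Build a Prokka command line, adding genus options only when known."""
    cmd = ["prokka", "--outdir", "prokka_out", "--prefix", "sample"]
    if genus:
        cmd += ["--genus", genus, "--usegenus"]
    cmd.append(contigs)
    return cmd
```

With no database, `kraken2_command(...)` returns `None`, the caller skips classification, and `prokka_command("contigs.fa")` runs without any genus options, which matches the fallback described above.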

I'm planning to spend some time implementing Kraken2 as a replacement for SendSketch on Friday morning, and I will base my work on your latest branch, abhinav/check_signalp, @abhi18av, so it's easy to merge everything back to develop without conflicts later.

abhi18av commented 3 years ago

Thanks for the update @boulund 👍

I am happy that we have decided to move beyond SendSketch, since it doesn't really make the workflow reproducible at all.

boulund commented 3 years ago

Implemented and tested in the dev branch.