ctmrbio / BACTpipe

BACTpipe: An assembly and annotation pipeline for bacterial genomics
https://bactpipe.readthedocs.org
MIT License
20 stars, 7 forks

Time to reboot BACTpipe (version 3) #106

Closed: boulund closed this issue 3 years ago

boulund commented 5 years ago

I think it is time to review the BACTpipe workflow and update/replace some parts.

Looking for input @huyue87 and @thorellk!

I have the following ideas:

Please share any comments or ideas on how we can streamline and improve BACTpipe. I remember we've also talked about potentially including some MLST tools, genome assembly assessment tools (e.g. #103), or phage finding tools (#96).

thorellk commented 5 years ago

Replace mash screen with sendsketch.sh - Sounds good.

Replace BBDuk with fastp. - I have never used fastp, but your arguments sound reasonable -> approved :)

License issue and SignalP-dependency of Prokka. - I think this sounds reasonable, but I have no experience in making Docker containers myself, so I don't know how much work this would be.

Update all main tools to the most recent versions. - No-brainer.

I think assembly assessment tools could be nice. Bacteria are not mentioned at all in BUSCO; it seems to describe application to larger genomes. So we should double-check that it is appropriate for bacterial genomes, or look into alternatives.

I think we should stick to these updates first and then add more things if/when there is time + an actual, concrete and general (in most projects) need for it.

boulund commented 4 years ago

@huyue87 and I brainstormed a bit about potential improvements to BACTpipe in version 3. We considered rewriting BACTpipe into something like the following workflow graph.

In this suggestion we use:

The suggested outputs will be:

Open questions

[Image: bactpipe_v3]

boulund commented 4 years ago

@thorellk @huyue87 and I recently had a meeting to plan for the work ahead. We decided to also add ResFinder to the suite of tools to be run. BACTpipe is really becoming a CGE workflow :laughing: ...

We also decided to split up development efforts into version 3 and 4. Version 3 will focus on getting the basic short read workflow in place, and version 4 will add support for long reads. We committed a few draft workflow graphs to docs/source/img on the develop branch.

valzip commented 4 years ago

Hi! I found your pipeline while looking for a program that would screen assembled contigs for contamination in my microbial pipeline. I am glad that my pipeline looks roughly similar to yours. I can recommend abricate from tseemann for virulence and resistance finding. It implements several databases, it is easy to run, it runs fast, and its in silico results are in agreement with our wet lab results.

boulund commented 4 years ago

Hi @valzip ! Thanks for sharing! We also love pretty much everything Torsten does :D. We've actually discussed abricate previously, and it's not at all impossible that we might decide to use abricate instead in the end.