metagenomics / metagenomics-tk

GNU Affero General Public License v3.0

Enable Binning Refinement for Production #335

Open pbelmann opened 1 year ago

pbelmann commented 1 year ago

Before bin refinement can become part of the production settings, the following changes must be made:

  1. The prodigal output is currently stored as part of the annotation module. This is confusing because prokka also runs prodigal, and subsequent modules use the prokka output. Possible solution: place the prodigal output in the magscot output folder instead.

  2. Because the binning module uses the collectFile operator, all subsequent tools such as annotation, magattributes and metabolomics are only executed once assembly, read mapping and binning have finished for all provided samples. This is a problem because assembly in particular takes a long time, and in the worst case just a few Megahit runs block the rest of the pipeline.

  3. Metabinner has so far not been part of the production settings due to various issues. See the possible fix in #93.

  4. Magscot edge cases must be fixed, see #333.

  5. Although hmmsearch does not need many resources, it still needs a lot of disk space. Possible quick fix: increase the resource label to highmemmedium.
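
To make the cost described in point 2 concrete, here is a minimal Python sketch (not Nextflow code and not taken from the pipeline). It assumes that the upstream steps of all samples start in parallel at hour 0, and the sample names and runtimes are hypothetical. It compares a collectFile-like barrier, where nothing is released until every sample is done, with per-sample emission.

```python
def downstream_start_times(upstream_hours, barrier):
    """Return the hour at which downstream work can start for each sample.

    upstream_hours maps a sample name to the (hypothetical) time its
    assembly, read mapping and binning take, assuming all samples start
    in parallel at hour 0.
    """
    if barrier:
        # Barrier behaviour: nothing is released until the slowest sample is done.
        release = max(upstream_hours.values())
        return {sample: release for sample in upstream_hours}
    # Per-sample emission: each sample continues as soon as its own upstream work ends.
    return dict(upstream_hours)


# Hypothetical runtimes in hours; sample_c stands for one slow Megahit run.
upstream_hours = {"sample_a": 2, "sample_b": 3, "sample_c": 20}

blocked = downstream_start_times(upstream_hours, barrier=True)
streamed = downstream_start_times(upstream_hours, barrier=False)

# Hours each sample waits only because of the barrier.
idle_hours = {s: blocked[s] - streamed[s] for s in upstream_hours}
print(idle_hours)  # {'sample_a': 18, 'sample_b': 17, 'sample_c': 0}
```

Under these assumed numbers, the two fast samples would each wait most of a day before annotation could start, which is the effect a per-sample channel in the binning module would avoid.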