microbiomedata / mg_annotation

Metagenome Annotation Workflow

Add outputs and adjust split size #7

Open scanon opened 2 years ago

scanon commented 2 years ago

This adds several missing outputs from the structural annotation portion of the pipeline (crt, genemark, prodigal, trnascan, rfam). The results of these steps were already being used in the full outputs, but the individual GFF files weren't being propagated as outputs.

This also adjusts the split size to 100M. Testing found that memory requirements and run times are still reasonable at this split size. The change significantly improves throughput, since most of the steps take about the same amount of time at the larger split size.
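
For readers unfamiliar with size-based splitting, here is a minimal sketch of the general idea. It is not the workflow's actual code. It assumes the split size is a cap on total sequence length per chunk (the unit behind "100M" is not stated above), and the function name `split_by_size` and the `(name, length)` record shape are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, int]  # (sequence name, sequence length); hypothetical shape


def split_by_size(records: Iterable[Record], split_size: int) -> Iterator[List[Record]]:
    """Group records into chunks whose total length stays within split_size.

    A single record longer than split_size is emitted as its own chunk
    rather than being cut. This is an illustrative assumption, not the
    pipeline's documented behavior.
    """
    chunk: List[Record] = []
    chunk_len = 0
    for name, length in records:
        # Close the current chunk if adding this record would exceed the cap.
        if chunk and chunk_len + length > split_size:
            yield chunk
            chunk, chunk_len = [], 0
        chunk.append((name, length))
        chunk_len += length
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Toy lengths; a real run would use a cap such as 100_000_000.
    demo = [("a", 60), ("b", 50), ("c", 30), ("d", 120)]
    for i, group in enumerate(split_by_size(demo, 100)):
        print(i, [name for name, _ in group])
```

With a larger cap the same input yields fewer chunks, so fewer per-chunk tasks need to be scheduled. That is consistent with the throughput gain described above when per-step run time changes little with chunk size.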

aclum commented 5 months ago

@scanon okay if I close this PR as obsolete? The block size of 100 is already in the master branch, and the missing output files were reconciled in December 2022.