I have merged the current backlog of pull requests into develop_lee because they were piling up.
Features Added:
Raise a ValueError when no contigs pass the coverage test in rule split_circular_and_linear_contigs. Fixes https://github.com/rotary-genomics/rotary/issues/199, where neither the circular nor the linear contig list is written, which causes a workflow error.
Update the QC before-and-after report code to handle KBP (highly filtered reads) and GBP (many Illumina reads) values. These values commonly broke the old version of the code. Results in these units are now converted to MBP. See https://github.com/rotary-genomics/rotary/pull/201 for details.
In rule search_contig_start, add the -p meta flag to run Prodigal in meta mode. In this mode, Prodigal does not scan the genome to build a gene-calling model. Instead, it predicts genes using prebuilt models, which is suitable for metagenomics. Prodigal requires at least 20K bases to build a model, which is larger than many circular plasmids. In some cases, Prodigal would fail when the only circular element was a small plasmid, because the rest of the genome was fragmented and the plasmid was shorter than 20K bases. @jmtsuji Do you think running with the -p meta flag is sufficient if we are only doing a start gene search?
Remove end_repair assemblies from assembly stats as they are most often identical to the flye assembly stats.
Aggregate checkm stats from all samples into a single file for easier viewing.
Add freebayes explicitly to the pyploca.yml install to prevent an install failure.
Snakemake 8.6.0 introduced a bug that prevents the deletion of temp files created before a checkpoint, so gigabytes of temp files accumulate and cause storage issues. I reported this in snakemake/snakemake#2982. Downgrade to version 7 as a temporary fix. Details: https://github.com/rotary-genomics/rotary/pull/216. I originally planned to go back to Snakemake 8.5.5, but I found that this version also causes some stability issues (I think it was before version 8's initial release), so I decided to downgrade to version 7 instead. Details: https://github.com/rotary-genomics/rotary/pull/217
Modify rule set_up_sample_directories so that each sample directory is generated dynamically, and remove the sample TSV as an input. This allows the sample TSV to be modified without forcing the entire pipeline to rerun from scratch.
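For reviewers, here is a minimal sketch of the guard described above for split_circular_and_linear_contigs. It is not the code from the PR: the function name, the dictionary layout of each contig record, and the min_coverage parameter are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def split_contigs_by_circularity(
    contigs: Iterable[dict], min_coverage: float
) -> Tuple[List[str], List[str]]:
    """Split contigs that pass a coverage threshold into circular and linear lists.

    Hypothetical helper: each contig is assumed to be a dict with the keys
    'name', 'coverage', and 'circular'. The real rule may store this differently.
    """
    passing = [c for c in contigs if c["coverage"] >= min_coverage]

    # Fail loudly instead of silently writing no output lists.
    if not passing:
        raise ValueError(
            f"No contigs passed the coverage threshold of {min_coverage}."
        )

    circular = [c["name"] for c in passing if c["circular"]]
    linear = [c["name"] for c in passing if not c["circular"]]
    return circular, linear
```

Raising early gives an explicit error message at the point of failure, rather than a downstream workflow error about missing contig lists.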
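The unit handling in the QC report change can be illustrated with a small sketch. This is an assumption about the general approach, not the code in pull 201: the helper name to_mbp and the lookup table are hypothetical, and the sketch assumes the report supplies a numeric value plus a unit string of KBP, MBP, or GBP.

```python
# Bases per unit. Hypothetical table for illustration only.
BASES_PER_UNIT = {
    "KBP": 1_000,
    "MBP": 1_000_000,
    "GBP": 1_000_000_000,
}


def to_mbp(value: float, unit: str) -> float:
    """Convert a value expressed in KBP, MBP, or GBP to MBP."""
    key = unit.strip().upper()
    if key not in BASES_PER_UNIT:
        raise ValueError(f"Unsupported unit: {unit!r}")
    # Convert to bases first, then to megabases.
    return value * BASES_PER_UNIT[key] / BASES_PER_UNIT["MBP"]
```

For example, to_mbp(500, "KBP") gives 0.5 and to_mbp(2.5, "GBP") gives 2500.0, so all values in the before-and-after report end up on a single scale.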