spacegraphcats / 2018-paper-spacegraphcats

Paper text and pipeline for "Exploring neighborhoods in large metagenome assembly graphs..."
https://biorxiv.org/content/early/2018/11/05/462788

update checkm rule to output tab separated table #11

Open taylorreiter opened 5 years ago

taylorreiter commented 5 years ago

The files in paper/figures/files_checkm/ are annoyingly difficult to parse. As per https://github.com/Ecogenomics/CheckM/issues/29, CheckM has a --tab_table option that produces a tab-separated file without the fancy formatting, which is much easier to parse. I think it can be added at these lines of code:

https://github.com/spacegraphcats/2018-paper-spacegraphcats/blob/master/pipeline-base/Snakefile#L572
https://github.com/spacegraphcats/2018-paper-spacegraphcats/blob/master/pipeline-base/Snakefile#L584
https://github.com/spacegraphcats/2018-paper-spacegraphcats/blob/master/pipeline-base/Snakefile#L596
https://github.com/spacegraphcats/2018-paper-spacegraphcats/blob/master/pipeline-base/Snakefile#L607

Which will then look like this:

rm -fr checkm.plass.bins && mkdir checkm.plass.bins && ln {input} checkm.plass.bins && checkm lineage_wf -x fa checkm.plass.bins checkm.plass.out -t {threads} --genes --pplacer_threads={threads} --tab_table -f checkm-plass.txt
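
For what it's worth, here is a rough sketch of how the tab-separated output could then be read downstream. It is untested against real CheckM output: the column names ("Bin Id", "Completeness", "Contamination") are my assumption about the --tab_table header, the function name read_checkm_table is made up, and the sample rows are invented purely for illustration.

```python
import csv
import io


def read_checkm_table(handle):
    """Map each bin ID to its completeness and contamination as floats.

    Assumes a tab-separated table with a header row containing the columns
    "Bin Id", "Completeness" and "Contamination" (an assumption about the
    --tab_table output, not verified here).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Bin Id"]: {
            "completeness": float(row["Completeness"]),
            "contamination": float(row["Contamination"]),
        }
        for row in reader
    }


# Made-up rows for illustration only; these are not real CheckM results.
SAMPLE = (
    "Bin Id\tMarker lineage\tCompleteness\tContamination\n"
    "bin_a\tk__Bacteria (UID203)\t95.5\t1.2\n"
    "bin_b\tk__Archaea (UID2)\t60.0\t10.0\n"
)

if __name__ == "__main__":
    table = read_checkm_table(io.StringIO(SAMPLE))
    for bin_id, stats in table.items():
        print(bin_id, stats["completeness"], stats["contamination"])
```

With a real file, the same function would take an open file handle, e.g. `with open("checkm-plass.txt") as fh: table = read_checkm_table(fh)`.
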
taylorreiter commented 5 years ago

I should note that I have not tested this.