donovan-h-parks / RefineM

A toolbox for improving metagenome-assembled genomes.
GNU General Public License v3.0

Outliers: Ignore "cov_perc" parameter #13

Closed evensannesriiser closed 6 years ago

evensannesriiser commented 6 years ago

Hi Donovan,

Thanks for both CheckM and RefineM, both very interesting tools! :)

Is it possible to fully "disable" the outlier identification based on --cov_perc? That is, make the command ignore that parameter? In some cases, I would like to filter bins based on outliers defined by diverging GC% or tetra signatures only.

Kind regards,

Even S. Riiser, PhD candidate, University of Oslo, Norway

donovan-h-parks commented 6 years ago

Hey Even. You can do this indirectly by setting --cov_corr to -2 and --cov_perc to 1000000000. The correlation can never fall below -1, and you are highly, highly unlikely to see a mean absolute percent error above 1000000000.
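To illustrate why these two thresholds switch the coverage test off, here is a minimal sketch. It assumes, without confirmation from the RefineM source, that a scaffold is flagged when its coverage correlation with the bin falls below --cov_corr or its mean absolute percent error exceeds --cov_perc. The function names and the sample coverage values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def mean_abs_perc_error(scaffold_cov, bin_cov):
    """Mean absolute percent error of a scaffold relative to its bin."""
    errors = [abs(s - b) / b for s, b in zip(scaffold_cov, bin_cov)]
    return 100.0 * sum(errors) / len(errors)

def is_cov_outlier(scaffold_cov, bin_cov, cov_corr, cov_perc):
    # Assumed rule (hypothetical, not taken from RefineM):
    # flag if the correlation is too low OR the percent error is too high.
    return (pearson(scaffold_cov, bin_cov) < cov_corr
            or mean_abs_perc_error(scaffold_cov, bin_cov) > cov_perc)

# Hypothetical coverage of one scaffold and of its bin across three samples.
scaffold = [50.0, 5.0, 1.0]
bin_mean = [10.0, 20.0, 30.0]

flagged_with_real_thresholds = is_cov_outlier(scaffold, bin_mean, 0.8, 50)
flagged_with_workaround = is_cov_outlier(scaffold, bin_mean, -2, 1000000000)
```

Under that assumed rule, a strongly divergent scaffold is flagged with ordinary thresholds but passes with the workaround values, because no correlation is below -2 and a percent error above 1000000000 is not realistic.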

evensannesriiser commented 6 years ago

Hi Donovan,

Thanks, but when I run the following command,

refinem outliers $REFINEM_MAXBIN_OUT/scaffold_stats.tsv $REFINEM_MAXBIN_OUT/outliers_cov_perc_ignored --gc_perc 98 --td_perc 98 --cov_corr -2 --cov_perc 1000000000 --individual_plots

I get the following error message:

refinem outliers: error: argument --cov_perc: invalid choice: 1000000000 (choose from -1, 0, 1, 2, 3, 4, (...) 997, 998, 999, 1000)

Seems like I'm not allowed to choose such a high cov_perc number...

Even

donovan-h-parks commented 6 years ago

This will be fixed in v0.0.21, which I will be releasing in the next hour.
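The error reported above has the form that Python's argparse emits when a value falls outside a declared set of choices. The sketch below reproduces that behaviour and shows one possible way to lift the limit. It assumes the option was declared with an integer range as its choices; the actual RefineM declaration and the change made in v0.0.21 are not shown in this thread, and the parser name, default value, and helper function here are hypothetical.

```python
import argparse

def build_parser(restrict_choices):
    """Build a toy parser with a --cov_perc option (hypothetical stand-in)."""
    parser = argparse.ArgumentParser(prog="demo outliers")
    if restrict_choices:
        # Values outside -1..1000 are rejected with "invalid choice".
        parser.add_argument("--cov_perc", type=int,
                            choices=range(-1, 1001), default=50)
    else:
        # No choices list: any number that parses as a float is accepted.
        parser.add_argument("--cov_perc", type=float, default=50)
    return parser

def accepts(parser, argv):
    """Return True if the parser accepts argv, False if it exits with an error."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        # argparse prints the error to stderr and exits with status 2.
        return False

argv = ["--cov_perc", "1000000000"]
strict_ok = accepts(build_parser(True), argv)
relaxed_ok = accepts(build_parser(False), argv)
relaxed_value = build_parser(False).parse_args(argv).cov_perc
```

With the restricted parser the large value is refused, while the unrestricted one accepts it, which is what the workaround of setting --cov_perc to 1000000000 needs.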