merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

--min-num-bins-gene-occurs doesn't work for external genomes in anvi-get-sequences-for-hmm-hits #1060

Closed ShaiberAlon closed 5 years ago

ShaiberAlon commented 5 years ago

When using --min-num-bins-gene-occurs in anvi-get-sequences-for-hmm-hits with external genomes, you get this message:

Config Error: OK. Well. This is awkward. You have like 1 bins, eh? And you are asking anvi'o
              to remove any that occurs in less than 5 bins. Do you see the problem here?
              Maybe it is time to take a break from work :(

The way I think it should work is:

  1. Rename --min-num-bins-gene-occurs to --min-num-genomes-gene-occurs (more consistent with the names --external-genomes and --internal-genomes).
  2. When this parameter is specified, the total number of internal + external genomes should be considered. For example, suppose there are two internal genomes and one external genome in the analysis, the HMM hit is in the external genome and also in ONE of the internal genomes, and --min-num-genomes-gene-occurs is 2. Then anvi'o should be happy and give us the output sequences.
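
To make point 2 concrete, here is a minimal sketch of the counting logic I have in mind. It is not anvi'o's actual code: the function name, the dict-of-sets input format, and the genome and gene names are all made up for illustration, and it assumes genome names are unique across the internal and external sets.

```python
from collections import Counter


def genes_passing_min_genomes(internal_hits, external_hits, min_num_genomes_gene_occurs):
    """Return the genes found in at least `min_num_genomes_gene_occurs` genomes.

    Both inputs are hypothetical: dicts mapping a genome name to the set of
    gene names with an HMM hit in that genome. Internal and external genomes
    are pooled before counting, so the threshold applies to their total.
    """
    # Pool internal and external genomes (assumes no name collisions).
    all_hits = {**internal_hits, **external_hits}
    num_genomes = len(all_hits)

    # The sanity check compares against the pooled total, not one source.
    if min_num_genomes_gene_occurs > num_genomes:
        raise ValueError(
            f"There are only {num_genomes} genomes, but the minimum number of "
            f"genomes a gene must occur in is {min_num_genomes_gene_occurs}."
        )

    # Count each gene once per genome it occurs in.
    occurrence = Counter(gene for genes in all_hits.values() for gene in set(genes))
    return sorted(g for g, n in occurrence.items() if n >= min_num_genomes_gene_occurs)


# The scenario from point 2: two internal genomes, one external genome,
# and a hit in the external genome plus ONE of the internal genomes.
internal = {"internal_01": {"Ribosomal_L1"}, "internal_02": set()}
external = {"external_01": {"Ribosomal_L1"}}
print(genes_passing_min_genomes(internal, external, 2))  # ['Ribosomal_L1']
```

With a threshold of 2 the gene is kept, because the single external genome and the one internal genome are counted together.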

@meren, what do you think?

meren commented 5 years ago

Both suggestions make sense.

For (1), we need to make sure the parameter name change is reflected in any relevant web material.

For (2), would you like to do it or would you prefer me to do it? :)

Best,

ShaiberAlon commented 5 years ago

I would prefer that you do it, if you don't mind. Especially because #1061 makes me worry that there are more hidden issues with this module.

meren commented 5 years ago

I see. I am looking at it now.

meren commented 5 years ago

Can you try again after cf874b8673160fd9709087076e1db2d36da8dde3? :)

ShaiberAlon commented 5 years ago

This still doesn't work.

meren commented 5 years ago

I wanted to see whether closing was going to fix it :(

ShaiberAlon commented 5 years ago

You should open an issue with the GitHub people :-)

meren commented 5 years ago

Brilliant.

meren commented 5 years ago

This must be closing this: a8a2f153da9d3a9f1fb3eac730e5878477a44fda

ShaiberAlon commented 5 years ago

Nope...

If you want to easily reproduce my issue, check out the branch metapan-workflow, and then run ./run_pangenomics_workflow_tests.sh (in the anvio test folder).

meren commented 5 years ago

I tried and couldn't reproduce this using the script ./run_pangenomics_workflow_tests.sh. I assume it is somehow fixed :(