Closed Ge0rges closed 1 year ago
This sounds like an issue with the HMMER version in the anvi'o environment (which should not have happened with the standard installation instructions). Would you be willing to install anvio-dev as explained on the installation page and test it again?
Hi Meren,
I can try to install the development environment, but this is a shared server (and I am not the admin), so it may take a bit of time. For reference, the installed version of HMMER is 3.3.2.
@meren I was able to install the dev environment with no errors. Same error, see below.
Traceback (most recent call last):
File "/Accounts/gkanaan/github/anvio/bin/anvi-run-hmms", line 143, in <module>
main(args)
File "/Accounts/gkanaan/github/anvio/anvio/terminal.py", line 881, in wrapper
program_method(*args, **kwargs)
File "/Accounts/gkanaan/github/anvio/bin/anvi-run-hmms", line 97, in main
search_tables.populate_search_tables(sources)
File "/Accounts/gkanaan/github/anvio/anvio/tables/hmmhits.py", line 277, in populate_search_tables
parser = parser_modules['search']['hmmer_table_output'](hmm_scan_hits_txt, alphabet=alphabet, context=context, program=self.hmm_program)
File "/Accounts/gkanaan/github/anvio/anvio/parsers/hmmer.py", line 582, in __init__
fixed_hmmer_table_txt = self.fix_sad_hmmer_table_output(hmmer_table_txt, col_names)
File "/Accounts/gkanaan/github/anvio/anvio/parsers/hmmer.py", line 754, in fix_sad_hmmer_table_output
hmmer_df.columns = col_names_plus_description_cols
File "/Accounts/gkanaan/.conda/envs/anvio-dev/lib/python3.7/site-packages/pandas/core/generic.py", line 5500, in __setattr__
return object.__setattr__(self, name, value)
File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
File "/Accounts/gkanaan/.conda/envs/anvio-dev/lib/python3.7/site-packages/pandas/core/generic.py", line 766, in _set_axis
self._mgr.set_axis(axis, labels)
File "/Accounts/gkanaan/.conda/envs/anvio-dev/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 216, in set_axis
self._validate_set_axis(axis, new_labels)
File "/Accounts/gkanaan/.conda/envs/anvio-dev/lib/python3.7/site-packages/pandas/core/internals/base.py", line 58, in _validate_set_axis
f"Length mismatch: Expected axis has {old_len} elements, new "
ValueError: Length mismatch: Expected axis has 16 elements, new values have 19 elements
I ran the same command. The output of anvi-self-test -v is:
Anvi'o .......................................: hope (v7.1-dev)
Profile database .............................: 38
Contigs database .............................: 20
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2
Thanks for trying anvio-dev! I'll take a look into this in the next few hours.
Hey @Ge0rges, thank you very much for your patience with this. It turned out to be a serious issue that we didn't see coming :) Your HMM directory uses the GENE:DNA context. This is the first time someone has tried to run a DNA-alphabet model on coding genes. All examples so far run AA or RNA models on genes, or DNA or RNA models on contigs, and never DNA models on genes. So the code was missing instructions to handle the output for that combination.
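For anyone who lands here with the same traceback: the failure is a plain column-count mismatch (a table with 16 columns being given 19 names). Below is a minimal sketch of the kind of explicit check a parser can make before labeling fields. It is plain Python, not anvi'o's actual code; the function name, the field values, and the column names are all hypothetical, and only the counts 16 and 19 come from the traceback above.

```python
def label_fields(fields, names):
    """Pair each field with a column name, refusing to guess on a count mismatch.

    Hypothetical helper for illustration only; not part of anvi'o or pandas.
    """
    if len(fields) != len(names):
        raise ValueError(
            f"Length mismatch: row has {len(fields)} fields, "
            f"but {len(names)} column names were given"
        )
    return dict(zip(names, fields))


# A made-up row with 16 fields, matching the count reported in the traceback.
row = [f"value_{i}" for i in range(16)]

# Two made-up name lists: one with 19 names (the failing case), one with 16.
names_19 = [f"col_{i}" for i in range(19)]
names_16 = [f"col_{i}" for i in range(16)]

# Using the 19-name list on a 16-field row is rejected with a clear message.
try:
    label_fields(row, names_19)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

# Using a name list of the right length works.
labeled = label_fields(row, names_16)

print(mismatch_message)
print(len(labeled))
```

The point of the sketch is only that the name list has to be chosen to match the table actually produced for a given combination of context and alphabet; how anvi'o makes that choice is in the commit linked in this thread.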
I think I fixed it in https://github.com/merenlab/anvio/commit/050b680300fb5fda2df666191b60ac084e63e8c0, and if you git pull from anvio-dev you should be able to run it on your contigs-db with no problem.
Thanks a lot again for sending a test dataset to figure this one out.
(If everything works, please consider reporting back and closing the issue.)
Hi @meren, your fix worked. Thanks for getting to it so promptly. I will close the issue with a minor final note that I think there's a typo in the docs of the parser file you edited here. Where it says GENE it should say CONTIG. Just to avoid future confusion :)
Good catch, thank you! Now fixed :)
And thanks for reporting back. I'm glad this is now resolved.
Short description of the problem
I think I've found a bug in anvi-run-hmms: the program crashed. My best guess is that a bug in the way anvi'o parses HMMER results leads to the error detailed below.
anvi'o version
System info
Anvi'o is running on a research server that I did not set up, so I am unsure exactly how it was installed. However, it exists in its own Conda environment, and I assume it was properly installed. Here is the output of uname -a:
Linux 5.13.19-2-pve #1 SMP PVE 5.13.19-4 x86_64 x86_64 x86_64 GNU/Linux
Detailed description of the issue
I ran:
anvi-run-hmms -H anvio_psych_hmm/ -c psych_genomes.db -T 20 --just-do-it
and obtained:
Files to reproduce
https://www.dropbox.com/sh/4u2u25om3wh0wbp/AAAA11l9KWekr73xh79aNipHa?dl=0