merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

anvi-interactive -C complaining of HMMs #552

Closed: jmeppley closed this issue 7 years ago

jmeppley commented 7 years ago

We are trying to view contig bins in anvi'o with "-C CONCOCT", but we get this message:

(/slipstream/home/bethanie/.conda/envs/anvio) bethanie@moleculardb:~/anvio_June14_2017$ anvi-interactive -c /slipstream/home/bethanie/anvio_June14_2017/anvio/contigs.db -p /slipstream/home/bethanie/anvio_June14_2017/anvio/samples-all-merged/PROFILE.db --taxonomic-level t_family  --port 1047 -s anvio/samples.all.db -C CONCOCT
Auxiliary Data ...............................: Found: /slipstream/home/bethanie/anvio_June14_2017/anvio/contigs.h5 (v. 1)
Contigs DB ...................................: Initialized: /slipstream/home/bethanie/anvio_June14_2017/anvio/contigs.db (v. 8)
Taxonomy .....................................: Initiated for taxonomic level for "t_family"
Mode .........................................: collection

Config Error: HMM's were not run for this contigs database :/

Short version: This is a weird data set made up of viral contigs. We ran anvi-run-hmms, but I don't think any of the core HMMs match anything in our contigs. I don't understand why this would prevent the --collection-name (-C) view from working, though.

More details: The HMMs were run, and you can see COG annotations when you inspect a contig:

[Screenshot, 2017-07-03 9:55:41 AM: contig inspection view showing COG annotations]

However, there are no completeness or duplication percentages if you load the CONCOCT bins from within Anvi'o (just "--" where the values should be).

I re-ran anvi-run-hmms and got this output:

(/slipstream/home/bethanie/.conda/envs/anvio) bethanie@moleculardb:~/anvio_June14_2017$ anvi-run-hmms -c anvio/contigs.db
HMM profiles .................................: 2 sources have been loaded: Rinke_et_al (162 genes, domain: archaea), Campbell_et_al (139 genes, domain: bacteria)
Target found .................................: AA:GENE
Auxiliary Data ...............................: Found: anvio/contigs.h5 (v. 1)
Contigs DB ...................................: Initialized: anvio/contigs.db (v. 8)
Sequences ....................................: 1468 sequences reported.
FASTA ........................................: /tmp/tmpcfm1j4/aa_gene_sequences.fa

HMM Profiling for Rinke_et_al
===============================================
Reference ....................................: Rinke et al, http://www.nature.com/nature/journal/v499/n7459/full/nature12352.html
Kind .........................................: singlecopy
Target .......................................: AA:GENE
Domain .......................................: archaea
Pfam model ...................................: /slipstream/home/bethanie/.conda/envs/anvio/lib/python2.7/site-packages/anvio/data/hmm/Rinke_et_al/genes.hmm.gz
Number of genes ..............................: 162
Number of CPUs will be used for search .......: 1
Temporary work dir ...........................: /tmp/tmpnHeqrL
HMM scan output ..............................: /tmp/tmpnHeqrL/hmm.output
HMM scan hits ................................: /tmp/tmpnHeqrL/hmm.hits
Log file .....................................: /tmp/tmpnHeqrL/00_log.txt
Number of raw hits ...........................: 0

HMM Profiling for Campbell_et_al
===============================================
Reference ....................................: Campbell et al, http://www.pnas.org/content/110/14/5540.short
Kind .........................................: singlecopy
Target .......................................: AA:GENE
Domain .......................................: bacteria
Pfam model ...................................: /slipstream/home/bethanie/.conda/envs/anvio/lib/python2.7/site-packages/anvio/data/hmm/Campbell_et_al/genes.hmm.gz
Number of genes ..............................: 139
Number of CPUs will be used for search .......: 1
Temporary work dir ...........................: /tmp/tmpwz_rnJ
HMM scan output ..............................: /tmp/tmpwz_rnJ/hmm.output
HMM scan hits ................................: /tmp/tmpwz_rnJ/hmm.hits
Log file .....................................: /tmp/tmpwz_rnJ/00_log.txt
Number of raw hits ...........................: 0
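
As a rough aid for reading logs like the one above, the following Python sketch pulls the "Number of raw hits" value for each HMM source out of captured anvi-run-hmms output. It assumes the log layout shown in this report; the function name `raw_hits_per_source` is hypothetical and is not part of anvi'o.

```python
import re


def raw_hits_per_source(log_text: str) -> dict:
    """Map each HMM source name to its reported raw hit count.

    Hypothetical helper: it relies only on the two line shapes visible
    in the log above ("HMM Profiling for <source>" and
    "Number of raw hits ....: <n>").
    """
    hits = {}
    current = None
    for line in log_text.splitlines():
        header = re.match(r"\s*HMM Profiling for (\S+)", line)
        if header:
            current = header.group(1)
            continue
        count = re.match(r"\s*Number of raw hits\s*\.*:\s*(\d+)", line)
        if count and current is not None:
            hits[current] = int(count.group(1))
    return hits


# Abbreviated copy of the log lines from this report.
sample = """\
HMM Profiling for Rinke_et_al
===============================================
Number of genes ..............................: 162
Number of raw hits ...........................: 0

HMM Profiling for Campbell_et_al
===============================================
Number of genes ..............................: 139
Number of raw hits ...........................: 0
"""

print(raw_hits_per_source(sample))
```

If every source maps to 0, as it does here, there are no single-copy gene hits from which completion or redundancy could be estimated.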

meren commented 7 years ago

Hi John,

The first one seems to be a proper bug :) Collection mode should work even if there are no HMMs. I will look into this ASAP.

On the other hand, the HMM output shows no hits for either source (Number of raw hits: 0), so it is not surprising that the interface shows no completion / redundancy scores. I apologize if I'm missing something.
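
To illustrate why zero hits leads to "--" in the interface, here is a minimal Python sketch of a completion / redundancy estimate from single-copy gene hits. It is a simplified assumption rather than anvi'o's actual code: the function names are hypothetical, completion is taken as the share of the source's genes seen at least once, and redundancy as the share seen more than once.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def completion_and_redundancy(
    hits: Iterable[str], n_genes_in_source: int
) -> Optional[Tuple[float, float]]:
    """Return (completion %, redundancy %), or None if there are no hits.

    hits: names of single-copy genes detected in a bin, one entry per hit.
    n_genes_in_source: number of gene models in the HMM source (e.g. 139).
    Simplified, hypothetical definitions; not anvi'o's implementation.
    """
    counts = Counter(hits)
    if n_genes_in_source <= 0 or not counts:
        return None  # nothing to estimate from
    completion = 100.0 * len(counts) / n_genes_in_source
    redundancy = (
        100.0 * sum(1 for c in counts.values() if c > 1) / n_genes_in_source
    )
    return completion, redundancy


def format_estimate(estimate: Optional[Tuple[float, float]]) -> Tuple[str, str]:
    """Render an estimate for display, using "--" when it is unavailable."""
    if estimate is None:
        return "--", "--"
    completion, redundancy = estimate
    return f"{completion:.1f}", f"{redundancy:.1f}"


# A bin with no HMM hits still gets a display value instead of an error.
print(format_estimate(completion_and_redundancy([], 139)))
```

The point of returning `None` rather than raising is the behavior described above: a collection can still be displayed when no HMM hits exist, with the scores simply left blank.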

Best,

jmeppley commented 7 years ago

Thanks. The second was not a bug. I was just trying to provide more background information.

meren commented 7 years ago

Hi John,

This behavior should be fixed now :) Please let me know if you get a chance to test it.

Thank you very much!

Best,

jmeppley commented 7 years ago

I just tried it with the latest code and it does work. Thanks!

meren commented 7 years ago

Perfect! Thank you very much for confirming.