merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

anvi-script-FASTA-to-contigs-db tries to search old hmms #415

Closed AstrobioMike closed 7 years ago

AstrobioMike commented 7 years ago

This script fails and says it couldn't find the old hmm sets:

Config Error: Each search database directory must contain following files: 'kind.txt',
'reference.txt', 'genes.txt', 'target.txt', and 'genes.hmm.gz'. Dupont_et_al does not seem to be a proper source.
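
To see which source directory triggers this message, one option is to check each directory for the five files the error names. The following is a minimal sketch, not anvi'o's own validation code: the function names `missing_hmm_files` and `check_hmm_sources` are hypothetical, and the only details taken from the report are the five file names in the error message. The location of the HMM directory on a given installation is an assumption you would need to fill in yourself.

```python
from pathlib import Path

# File names copied from the error message; everything else here is illustrative.
REQUIRED_FILES = ("kind.txt", "reference.txt", "genes.txt", "target.txt", "genes.hmm.gz")


def missing_hmm_files(source_dir):
    """Return the required files that are absent from one HMM source directory."""
    source = Path(source_dir)
    return [name for name in REQUIRED_FILES if not (source / name).is_file()]


def check_hmm_sources(hmm_root):
    """Map each incomplete sub-directory of hmm_root to the files it is missing."""
    problems = {}
    for entry in sorted(Path(hmm_root).iterdir()):
        if entry.is_dir():
            missing = missing_hmm_files(entry)
            if missing:
                problems[entry.name] = missing
    return problems
```

Calling `check_hmm_sources` on the directory that holds the HMM sources would return an empty dictionary when every source is complete, and otherwise list the missing files per source, which is one way a stale directory left over from an older installation might show up.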

meren commented 7 years ago

Hi Mike,

Thanks for the report, but I think this is a problem with your configuration (in other words, you probably screwed up your installation while trying to help me test the master :)). I really feel responsible for this, and I apologize for it.

In fact, it should never happen on a properly configured system. I present you with the evidence from a computer that is running from the master:

meren ~/github/anvio/tests/sandbox $ anvi-script-FASTA-to-contigs-db contigs.fa

:: INPUT DIR: /Users/meren/github/anvio/tests/sandbox, FNAME: contigs ...

:: RENAMING CONTIGS ...

Input ........................................: /Users/meren/github/anvio/tests/sandbox/contigs.fa
Output .......................................: /Users/meren/github/anvio/tests/sandbox/contigs-clean.fa
Minimum length ...............................: 0
Total num contigs ............................: 6
Total num nucleotides ........................: 57,030
Contigs removed ..............................: 0 (0.00% of all)
Nucleotides removed ..........................: 0 (0.00% of all)
Deflines simplified ..........................: True

:: GENERATING THE CONTIGS DB ...

Finding ORFs in contigs
===============================================
Genes ........................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpLMpA6k/contigs.genes
Proteins .....................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpLMpA6k/contigs.proteins
Log file .....................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpLMpA6k/00_log.txt
Result .......................................: Prodigal (v2.6.3) has identified 51 genes.

Contigs with at least one gene call ..........: 6 of 6 (100.0%)
Contigs database .............................: A new database, /Users/meren/github/anvio/tests/sandbox/contigs.db, has been created.
Number of contigs ............................: 6
Number of splits .............................: 6
Total number of nucleotides ..................: 57,030
Gene calling step skipped ....................: False
Splits broke genes (non-mindful mode) ........: False
Desired split length (what the user wanted) ..: 20,000
Average split length (wnat anvi'o gave back) .: (Anvi'o did not create any splits)

:: RUNNING HMMs ...

HMM profiles .................................: 2 sources have been loaded: Rinke_et_al (162 genes, domain: archaea), Campbell_et_al (139 genes, domain: bacteria)
Target found .................................: AA:GENE
Auxiliary Data ...............................: Found: /Users/meren/github/anvio/tests/sandbox/contigs.h5 (v. 1)
Contigs DB ...................................: Initialized: /Users/meren/github/anvio/tests/sandbox/contigs.db (v. 7)
Sequences ....................................: 51 sequences reported.
FASTA ........................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpf7FfFJ/aa_gene_sequences.fa

HMM Profiling for Rinke_et_al
===============================================
Reference ....................................: Rinke et al, http://www.nature.com/nature/journal/v499/n7459/full/nature12352.html
Kind .........................................: singlecopy
Target .......................................: AA:GENE
Domain .......................................: archaea
Pfam model ...................................: /Users/meren/github/anvio/anvio/data/hmm/Rinke_et_al/genes.hmm.gz
Number of genes ..............................: 162
Number of CPUs will be used for search .......: 1
Temporary work dir ...........................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpdy9OsX
HMM scan output ..............................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpdy9OsX/hmm.output
HMM scan hits ................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpdy9OsX/hmm.hits
Log file .....................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpdy9OsX/00_log.txt
Number of raw hits ...........................: 1

HMM Profiling for Campbell_et_al
===============================================
Reference ....................................: Campbell et al, http://www.pnas.org/content/110/14/5540.short
Kind .........................................: singlecopy
Target .......................................: AA:GENE
Domain .......................................: bacteria
Pfam model ...................................: /Users/meren/github/anvio/anvio/data/hmm/Campbell_et_al/genes.hmm.gz
Number of genes ..............................: 139
Number of CPUs will be used for search .......: 1
Temporary work dir ...........................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpOWAmvK
HMM scan output ..............................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpOWAmvK/hmm.output
HMM scan hits ................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpOWAmvK/hmm.hits
Log file .....................................: /var/folders/9x/wzgz_dhx4x97t34z29fh8sjc0000gn/T/tmpOWAmvK/00_log.txt
Number of raw hits ...........................: 2

meren commented 7 years ago

I suggest you remove all installed versions of anvi'o, including any scripts left behind in various bin dirs, and then reinstall it :) If that doesn't fix it, I will fly to LA and fix it myself.
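
Leftover scripts from an earlier installation can be hard to spot by eye. As a rough sketch of one way to look for them, the code below scans every directory on `PATH` for executables whose names start with `anvi-` and reports any name that appears in more than one directory. The helper names `find_scripts_on_path` and `duplicated_scripts` are hypothetical, and the assumption that leftover copies show up as duplicates on `PATH` may not hold on every system.

```python
import os
from collections import defaultdict


def find_scripts_on_path(prefix="anvi-", path=None):
    """Map each executable whose name starts with prefix to the PATH directories holding it."""
    if path is None:
        path = os.environ.get("PATH", "")
    locations = defaultdict(list)
    seen_dirs = set()
    for directory in path.split(os.pathsep):
        # Skip empty entries, repeated entries, and entries that are not directories.
        if not directory or directory in seen_dirs or not os.path.isdir(directory):
            continue
        seen_dirs.add(directory)
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # unreadable directory; nothing to report from it
        for name in names:
            full = os.path.join(directory, name)
            if name.startswith(prefix) and os.path.isfile(full) and os.access(full, os.X_OK):
                locations[name].append(directory)
    return dict(locations)


def duplicated_scripts(locations):
    """Keep only the scripts that were found in more than one directory."""
    return {name: dirs for name, dirs in locations.items() if len(dirs) > 1}


if __name__ == "__main__":
    for name, dirs in sorted(duplicated_scripts(find_scripts_on_path()).items()):
        print(name, "->", ", ".join(dirs))
```

A script listed under several directories is a hint, not proof, that more than one installation is present; the first directory in the list is the copy the shell would normally run.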