merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0
440 stars 145 forks source link

Error during RNA gene calls #623

Closed SilasK closed 7 years ago

SilasK commented 7 years ago

I installed anivo3 via conda and run

anvi-run-hmms on a database of binned contigs. I would expect 1 genome with 70%, it should contain 16S.

HMM Profiling for Ribosomal_RNAs
===============================================
Reference ....................................: Seeman T, https://github.com/tseemann/barrnap
Kind .........................................: Ribosomal_RNAs
Alphabet .....................................: RNA
Context ......................................: CONTIG
Domain .......................................: N\A
HMM model path ...............................: /home/cegg/kieser/anaconda3/envs/anvio3/lib/python3.5/site-packages/anvio/data/hmm/Ribosomal_RNAs/genes.hmm.gz
Number of genes ..............................: 12
Number of CPUs will be used for search .......: 1
Temporary work dir ...........................: /tmp/tmpryjqv_5m
HMM scan output ..............................: /tmp/tmpryjqv_5m/hmm.output
HMM scan hits ................................: /tmp/tmpryjqv_5m/hmm.hits
Log file .....................................: /tmp/tmpryjqv_5m/00_log.txt
Number of raw hits ...........................: 4                                                                   

Psst. Your fancy HMM profile 'Ribosomal_RNAs' speaking
===============================================
Alright! You just called an HMM profile that runs on contigs. Because it is not
working with anvi'o gene calls directly, the resulting hits will need to be
added as 'new gene calls' into the contigs database. This is a new feature, and
if it starts screwing things up for you please let us know. Other than that
you're pretty much golden. Carry on.

Contigs with at least one gene call ..........: 1056 of 1057 (99.9%)                                                
Gene calls added to db .......................: 4 (from source "Ribosomal_RNAs")                                    
Contigs database .............................: An existing database, CR5.006.db, has been initiated.
Number of contigs ............................: 1,057
Number of splits .............................: 1,737
Total number of nucleotides ..................: 2,766,684
Split length .................................: 1,000
Error in library(gtools) : there is no package called 'gtools'
meren commented 7 years ago

Dear Silas,

This output makes no sense to me :/ The first part seems to be coming from anvi-run-hmms, and the very last line,

Error in library(gtools) : there is no package called 'gtools'

only appears in anvi-script-gen_stats_for_single_copy_genes.R, which is not called by anvi-run-hmms.

Can you please send a complete list of commands and outputs?

You may see what Ribosomal RNAs identified in your contigs database by running this command and inspecting the output FASTA File:

anvi-get-sequences-for-hmm-hits -c CR5.006.db \
                                --hmm-source Ribosomal_RNAs \
                                -o Ribosomal_RNAs.fa
SilasK commented 7 years ago

True I was using the anvi-script-gen_stats_for_single_copy_genes.R just after anvi-run-hmms

Here is my full code.

source activate anvio3

Creating the contigs database

fasta=CR5.006.fasta db=CR5.006.db

anvi-gen-contigs-database --contigs-fasta $fasta \                              --project-name "Find Genome of 331720" \                            -o $db \                            -L 1000

  anvi-display-contigs-stats contigs.db

Next, I populate the single-copy gene hit tables in this newly generated contigs database:

anvi-run-hmms -c $db anvi-script-gen_stats_for_single_copy_genes.py $db anvi-script-gen_stats_for_single_copy_genes.R ${db}.hits ${db}.genes

meren commented 7 years ago

Thanks, Silas.

If you type R in your terminal and run this, the error due to missing packages will most likely go away:

install.packages(c('gtools', 'gridExtra', 'ggplot2', 'optparse'))

But this script is now outdated, and encapsulated by @ozcan's latest addition to anvi'o, anvi-display-contigs-stats. If there is no particular reason for you to use the R script, I would suggest you to utilize this new program.

Best,