merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Misidentified common functions during anvi-gen-genomes-storage #475

Closed: meren closed this issue 7 years ago

meren commented 7 years ago

This happens when at least one external or internal genome is missing functional annotations:

$ anvi-gen-genomes-storage -e external-genomes.txt -i internal-genomes.txt -o T-GENOMES.h5

WARNING
===============================================
Good news! Anvi'o found all these functions that are common to all of your
genomes and will use them for downstream analyses and is very proud of you:
'SMART, TIGRFAM, Pfam, Gene3D, Coils, Hamap, ProSitePatterns, ProSiteProfiles,
PRINTS, PIRSF, SUPERFAMILY'.

Internal genomes .............................: 1 have been initialized.
External genomes .............................: 3 found.

* INTERNAL is stored with 23 genes (1 of which were partial)

Config Error: Some of the functional sources you requested are missing from the contigs
              database '/Users/meren/github/anvio/tests/sandbox/test-output/pan_test/01.db'.
              Here they are (or here it is, whatever): 'SMART', 'TIGRFAM', 'Pfam', 'Gene3D',
              'Coils', 'Hamap', 'ProSitePatterns', 'ProSiteProfiles', 'PRINTS', 'PIRSF',
              'SUPERFAMILY'.

If even a single genome is missing functional annotations, functions should not be considered for any of the genomes.
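The expected behavior can be sketched as a plain set intersection over the annotation sources of every genome. This is only an illustration of the rule stated above, not anvi'o's actual code: the function name `common_function_sources`, the dictionary layout, and the genome names are all hypothetical.

```python
def common_function_sources(sources_per_genome):
    """Return the annotation sources present in every genome.

    `sources_per_genome` (hypothetical layout) maps a genome name to the
    set of functional annotation sources found for it. A genome without
    any annotation is represented by an empty set.
    """
    if not sources_per_genome:
        return set()
    # Intersecting across ALL genomes means a single empty set
    # empties the result, which is the behavior the issue asks for.
    return set.intersection(*(set(s) for s in sources_per_genome.values()))


# Made-up input: one annotated genome, three without annotations.
genomes = {
    "genome_a": {"Pfam", "TIGRFAM", "SMART"},
    "genome_b": set(),
    "genome_c": set(),
    "genome_d": set(),
}

# Genomes that would trigger a warning instead of a hard error.
unannotated = sorted(name for name, srcs in genomes.items() if not srcs)
common = common_function_sources(genomes)

print(f"{len(unannotated)} of the {len(genomes)} genomes have no functions")
print(f"common sources: {sorted(common)}")
```

With this rule, the sources of the annotated genome are never reported as common unless every other genome also carries them.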

meren commented 7 years ago

OK. The same command now gives this warning, and continues happily. HOW NICE.

$ anvi-gen-genomes-storage -e external-genomes.txt -i internal-genomes.txt -o T-GENOMES.h5

WARNING
===============================================
Some of your genomes (3 of the 4, to be precise) seem to have no functional
annotation. Since this workflow can only use matching functional annotations
across all genomes involved, having even one genome without any functions means
that there will be no matching function across all. Things will continue to
work, but you will have no functions at the end for your protein clusters.

Internal genomes .............................: 1 have been initialized.
External genomes .............................: 3 found.

* INTERNAL is stored with 23 genes (1 of which were partial)
* g01 is stored with 119 genes (0 of which were partial)
* g02 is stored with 119 genes (0 of which were partial)
* g03 is stored with 117 genes (1 of which were partial)

The new genomes storage ......................: T-GENOMES.h5 (signature: 24a904f3)
Number of genomes ............................: 4 (internal: 1, external: 3)
Number of gene calls .........................: 378
Number of partial gene calls .................: 2