merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Skip checking genome hashes not working #2202

Open lexikazen opened 5 months ago

lexikazen commented 5 months ago

Short description of the problem

Anvi'o told me that my internal genomes all have the same hash and won't accept the flag it suggested to use to overcome that.

anvi'o version

Anvi'o .......................................: marie (v8)
Python .......................................: 3.10.13

Profile database .............................: 38
Contigs database .............................: 21
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2

anvi-self-test --version

System info

macOS Sonoma; anvi'o was installed using conda.

Detailed description of the issue

I wanted to run a pangenome analysis of some of the bins generated by the metagenomics workflow. However, when I ran the command anvi-gen-genomes-storage -e Turicibacter_panalysis.txt -o bile-modifier-genomes.db, I got this error:

Config Error: While working on your external genomes, anvi'o realized that genome
Turicibacter_sanguinis and Turicibacter_uncultured seem to have the same hash. If you are aware of this and/or if you would like anvi'o to not check genome
hashes, please use the flag --skip-checking-genome-hashes.

However, when I add that flag to the command, I get the error anvi-gen-genomes-storage: error: unrecognized arguments: --skip-checking-genome-hashes. The genomes are all clearly different species and have different splits associated with them, so I'm not sure what to do now.
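For anyone hitting the same error, one way to see which entries collide is to read the hash stored in each contigs database listed in the genomes file. The sketch below is only a diagnostic under stated assumptions: it assumes the file is TAB-delimited with `name` and `contigs_db_path` columns, and that each contigs database is an SQLite file keeping its hash in a key/value table called `self` under the key `contigs_db_hash`. The function names are hypothetical and not part of anvi'o.

```python
import csv
import sqlite3
from collections import defaultdict


def read_contigs_db_hash(db_path):
    """Return the hash stored in a contigs database, or None if it is absent."""
    # Assumption: metadata lives in a key/value table named `self`.
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT value FROM self WHERE key = 'contigs_db_hash'"
        ).fetchone()
    finally:
        con.close()
    return row[0] if row else None


def find_shared_hashes(genomes_file):
    """Map every hash that occurs more than once to the genome names sharing it."""
    names_by_hash = defaultdict(list)
    with open(genomes_file, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            db_hash = read_contigs_db_hash(row["contigs_db_path"])
            names_by_hash[db_hash].append(row["name"])
    # Keep only the hashes reported for two or more genomes.
    return {h: names for h, names in names_by_hash.items() if len(names) > 1}
```

Calling `find_shared_hashes("Turicibacter_panalysis.txt")` would then list the genome names that report the same hash. If several names point at the same contigs database file, they will necessarily share a hash, which is one possible explanation for the error above.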

Files / commands to reproduce the issue

This is the file I used to tell anvi'o which bins to use: Turicibacter_panalysis.txt

meren commented 5 months ago

I'm afraid there is a problem with the internal genomes system.

While we hope to address this soon, you can continue your analysis by using the program anvi-split to get a standalone contigs-db and profile-db pair for each of your bins, and then use an external genomes file to start everything.
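Once the bins have been split, the external genomes file can be put together by hand or with a few lines of Python. The sketch below is a minimal example, assuming the split output directory contains one subdirectory per bin with a file named CONTIGS.db inside it, and that the external genomes file is TAB-delimited with `name` and `contigs_db_path` columns. The function name and the paths in the usage note are hypothetical.

```python
import csv
from pathlib import Path


def write_external_genomes(split_dir, out_path):
    """Write a TAB-delimited external genomes file and return the bin names used."""
    rows = []
    # Assumption: each bin has its own subdirectory holding a CONTIGS.db file.
    for bin_dir in sorted(Path(split_dir).iterdir()):
        contigs_db = bin_dir / "CONTIGS.db"
        if bin_dir.is_dir() and contigs_db.is_file():
            rows.append((bin_dir.name, str(contigs_db)))

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["name", "contigs_db_path"])
        writer.writerows(rows)

    return [name for name, _ in rows]
```

For example, `write_external_genomes("SPLIT_BINS", "external-genomes.txt")` would produce a file that can then be passed to anvi-gen-genomes-storage with -e, as in the original command. Subdirectories without a CONTIGS.db are skipped rather than treated as errors, so it is worth checking the returned list against the bins you expect.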