merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

[BUG] the file hash for Pfam-A.hmm.gz doesn't match to the hash we expected #1963

Open kafker opened 2 years ago

kafker commented 2 years ago

Short description of the problem

Hello there,

anvi-setup-pfams fails with a Config Error:

anvi-setup-pfams --pfam-data-dir $SCRATCH/pfam
* No Pfam version specified. Using current release.
Found Pfam version ...........................: 35.0 (2021-11)
Database URL .................................: http://ftp.ebi.ac.uk/pub/databases/Pfam/current_release
Downloaded successfully ......................: /g100_scratch/userexternal/mcappel1/pfam/Pfam-A.hmm.gz
Downloaded successfully ......................: /g100_scratch/userexternal/mcappel1/pfam/Pfam.version.gz
Downloaded successfully ......................: /g100_scratch/userexternal/mcappel1/pfam/Pfam-A.clans.tsv.gz

Config Error: Please re-run setup with --reset, the file hash for Pfam-A.hmm.gz doesn't match
              to the hash we expected. If you continue to get this error after doing that, try
              removing the entire Pfams data directory
              (/#####/#####/#####/pfam) manually and running setup again
              (without the --reset flag).

Additionally, using the --reset flag does not help because the pfam directory is not empty:

anvi-setup-pfams --pfam-data-dir $SCRATCH/ --reset

Config Error: You are attempting to run Pfam setup on a non-default data directory
              (/g100_scratch/userexternal/mcappel1/) using the --reset flag. To avoid
              automatically deleting a directory that may be important to you, anvi'o refuses
              to reset directories that have been specified with --pfam-data-dir. If you
              really want to get rid of this directory and regenerate it with Pfam data
              inside, then please remove the directory yourself using a command like `rm -r
              /g100_scratch/userexternal/mcappel1/`. We are sorry to make you go through this
              extra trouble, but it really is the safest way to handle things. 

anvi'o version

anvi-self-test --version
Anvi'o .......................................: hope (v7.1)

Profile database .............................: 38
Contigs database .............................: 20
Pan database .................................: 15
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 2
tRNA-seq database ............................: 2

System info

Operating System: CentOS Linux 8

Anvi'o was installed via conda

meren commented 2 years ago

Additionally, using the --reset flag does not help because the pfam directory is not empty

In your first command you use --pfam-data-dir $SCRATCH/pfam, but in your second you pass --pfam-data-dir $SCRATCH/ as the directory to be cleaned. It is a blessing that anvi'o doesn't erase that directory and instead says:

Config Error: You are attempting to run Pfam setup on a non-default data directory (/g100_scratch/userexternal/mcappel1/) using the --reset flag. To avoid automatically deleting a directory that may be important to you, anvi'o refuses to reset directories that have been specified with --pfam-data-dir.

The best solution here is this:

rm -rf $SCRATCH/pfam
anvi-setup-pfams --pfam-data-dir $SCRATCH/pfam

Apart from that, I can reproduce this problem. But I'm afraid the Pfam server is being funky today: not only is the connection quite slow here, it also seems to cut connections. I was able to download Pfam-A.hmm.gz from http://ftp.ebi.ac.uk/pub/databases/Pfam/current_release manually using my browser, move it into my test directory from the terminal, and then run anvi-setup-pfams, which then worked properly :/
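
For anyone who wants to try the same manual workaround from the terminal instead of a browser, here is a rough sketch. The two helper functions are hypothetical and not part of anvi'o. They assume that curl and gzip are installed and that the hash mismatch comes from a download that was cut off, which is only a guess based on the flaky connection described above. The check only confirms that the file is a complete gzip archive; it does not compare it against the hash anvi'o expects.

```shell
# Hypothetical helper (not part of anvi'o): succeeds only if the file is a
# complete, readable gzip archive. A download that was cut off fails here.
is_complete_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Hypothetical helper (not part of anvi'o): download one file from the Pfam
# current_release directory with retries, then check that it is complete.
#   $1: file name, e.g. Pfam-A.hmm.gz
#   $2: destination directory
fetch_pfam_file() {
    mkdir -p "$2" &&
    curl --fail --location --retry 5 \
         --output "$2/$1" \
         "http://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/$1" &&
    is_complete_gzip "$2/$1"
}

# Example use (commented out so that nothing is downloaded by accident):
# fetch_pfam_file Pfam-A.hmm.gz "$SCRATCH/pfam" &&
#     anvi-setup-pfams --pfam-data-dir "$SCRATCH/pfam"
```

If fetch_pfam_file fails, the download was most likely interrupted again and it is worth retrying later. Re-running anvi-setup-pfams afterwards mirrors what worked in my test above; I have not checked here whether setup always reuses a file that is already in the directory.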

meren commented 2 years ago

I'll keep this open in case others experience this, too.

Sorry about the inconvenience.