EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

makehmmerdb fails with input/output error #234

Closed RivasLab closed 2 years ago

RivasLab commented 3 years ago

This is a follow-up to issue https://github.com/EddyRivasLab/hmmer/issues/168

We are using hmmer 3.3.2, which is a version after the previous makehmmerdb issue was closed.

We are trying to use makehmmerdb for a large dataset of vertebrate genomes (~600GB).

This is the error we get after ~24h of running:

/var/slurmd/spool/slurmd/job22844537/slurm_script: line 14: /n/holylfs03/LABS/eddy_lab/Users/maoaoyue/hmmer-3.3.2/src/makehmmerdb: Input/output error

This is the script we used on the Odyssey cluster:

#!/bin/bash
#SBATCH -n 1 # Number of cores requested
#SBATCH -N 1 # Ensure that all cores are on one machine
#SBATCH -t 3-00:00 # Runtime in D-HH:MM (3 days)
#SBATCH -p eddy # Partition to submit to
#SBATCH --mem=16000 # Memory in MB (see also --mem-per-cpu)
#SBATCH --open-mode=append
#SBATCH -o %j.out # Standard out goes to this file
#SBATCH -e %j.err # Standard err goes to this file
hostname

DB=/n/holylfs03/LABS/eddy_lab/home/wgao/vertebrates/vertebrate_genomes.fasta
BF=/n/holylfs03/LABS/eddy_lab/Users/maoaoyue/vg.hmmerdb
/n/holylfs03/LABS/eddy_lab/Users/maoaoyue/hmmer-3.3.2/src/makehmmerdb $DB $BF

traviswheeler commented 3 years ago

Looking back through the history of this issue (and the related notes near the end of #168), I believe the error reported by @RivasLab took one of these forms, depending on input:

buildAndWriteFMIndex: Error writing T in FM index.
buildAndWriteFMIndex: Error writing BWT in FM index.
buildAndWriteFMIndex: Error writing SA in FM index.

These are all the sort of error you'd get if you ran out of disk space in the temporary directory. Near the end of the conversation in #168, we guessed that the problem was related to the location of the temporary directory: if you're on a cluster with a massive network drive but a small local disk, and /tmp is mounted on the local disk, you'd likely get an out-of-disk error for such a large sequence input. @npcarter suggested setting $TMPDIR to point to a large filesystem (/n/eddyfs03 or /n/holylfs03). Did this overcome the problem?
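For anyone who wants to try the $TMPDIR suggestion, here is a rough sketch of what could go near the top of the Slurm script. It assumes, as guessed above, that the temporary files end up wherever $TMPDIR points. The helper names `free_kb` and `has_room`, and the `SCRATCH` and `NEED_KB` variables, are made up for illustration and are not part of HMMER or Slurm; the defaults are placeholders to be replaced with a directory on the large filesystem and a realistic size.

```shell
#!/bin/bash
# Hypothetical helper: print the free space (in KB) on the filesystem holding $1.
# df -P forces the one-line-per-filesystem POSIX layout; column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Hypothetical helper: succeed only if directory $1 exists, is writable,
# and has at least $2 KB free.
has_room() {
    local dir=$1 need_kb=$2
    [ -d "$dir" ] && [ -w "$dir" ] && [ "$(free_kb "$dir")" -ge "$need_kb" ]
}

# Placeholder defaults (assumptions): point SCRATCH at a directory on the
# large network filesystem and set NEED_KB to the space you expect to need.
SCRATCH=${SCRATCH:-/tmp}
NEED_KB=${NEED_KB:-1}

if has_room "$SCRATCH" "$NEED_KB"; then
    export TMPDIR=$SCRATCH
    echo "TMPDIR=$TMPDIR"
    # makehmmerdb "$DB" "$BF"   # run the real job here
else
    echo "not enough space in $SCRATCH" >&2
fi
```

Checking free space before a multi-day job at least turns a failure after ~24h into an immediate one, though it cannot tell how much space makehmmerdb will actually need.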

npcarter commented 3 years ago

Closing this issue because we believe it has been addressed as best as can be. Please open a new issue if the proposed $TMPDIR fix doesn't solve the problem.