Closed darencard closed 4 months ago
Please disregard my earlier report, as I have solved the issue. Here are more details in case it is helpful to anyone else.
I had a hunch that something was failing silently and that appears to be correct. I was running an interactive job on the HPC to perform this installation. Upon terminating that job, I received the following message.
slurmstepd: error: Detected 3 oom_kill events in StepId=31843435.0. Some of the step tasks have been OOM Killed.
This suggested to me that some process during the installation was failing because it was running out of memory. The obvious culprit is the building of RepeatMaskerLib.h5.
To confirm this, I started a new interactive job with 1 GB of memory (previously, I had used 500 MB). I then re-ran the installation as above, which worked properly this time. Therefore, it appears that at least 1 GB of memory is needed to build the libraries properly.
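For anyone hitting the same thing under Slurm, here is a rough sketch of how the memory request and the OOM check might look. The `srun` and `sacct` options in the comments are standard Slurm options, but whether your site allows interactive `srun` sessions is an assumption, and the helper name `has_oom_kill` is hypothetical (it is not part of Slurm or RepeatMasker).

```shell
#!/usr/bin/env bash
# Request an interactive session with an explicit memory limit (standard Slurm
# options; partition and time settings are site-specific and omitted here):
#   srun --mem=1G --pty bash
#
# After the job ends, requested and peak memory can be inspected with:
#   sacct -j <jobid> --format=JobID,State,ReqMem,MaxRSS

# has_oom_kill LOGFILE
# Hypothetical helper: succeeds (exit status 0) if LOGFILE contains a
# slurmstepd OOM report such as "Detected 3 oom_kill events".
has_oom_kill() {
    grep -Eq 'Detected [0-9]+ oom_kill events?' "$1"
}
```

In an interactive job the slurmstepd message only shows up in the terminal when the job ends, as it did for me. For a batch job it should normally end up in the job's output file, so something like `has_oom_kill slurm-<jobid>.out && echo "job was OOM killed"` could be used there, assuming the default output file naming.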
perl ./configure
-- Setting perl interpreter...
Can't open DateRepeats: No such file or directory.
RepeatMasker Configuration Program
Checking for libraries...
Rebuilding RepeatMaskerLib.h5 master library
- Read in 49011 sequences from /home/dac9979/repeat_bin/RepeatMasker/Libraries/RMRBSeqs.embl
- Read in 49011 annotations from /home/dac9979/repeat_bin/RepeatMasker/Libraries/RMRBMeta.embl
Merging Dfam + RepBase into RepeatMaskerLib.h5 library..........................................................
File: /home/dac9979/repeat_bin/RepeatMasker/Libraries/RepeatMaskerLib.h5
Database: Dfam withRBRM
Version: 3.6
Date: 2022-04-12
Dfam - A database of transposable element (TE) sequence alignments and HMMs.
RBRM - RepBase RepeatMasker Edition - version 20181026
Total consensus sequences: 63852
Total HMMs: 18987
.
/usr/bin/which: no trf409.linux64 in (/n/cluster/bin:/opt/singularity/bin:/n/cluster/bin:/opt/singularity/bin:/home/dac9979/miniforge3/bin:/home/dac9979/miniforge3/condabin:/n/cluster/bin:/opt/singularity/bin:/usr/local/bin:/usr/bin:/opt/puppetlabs/bin:/usr/local/rvm/bin:/usr/local/sbin:/usr/sbin:/home/dac9979/.local/bin:/home/dac9979/bin)
The full path including the name for the TRF program.
TRF_PRGM: /home/dac9979/repeat_bin/trf409.linux64
Add a Search Engine:
1. Crossmatch: [ Un-configured ]
2. RMBlast: [ Un-configured ]
3. HMMER3.1 & DFAM: [ Un-configured ]
4. ABBlast: [ Un-configured ]
5. Done
Enter Selection: 2
/usr/bin/which: no rmblastn in (/n/cluster/bin:/opt/singularity/bin:/n/cluster/bin:/opt/singularity/bin:/home/dac9979/miniforge3/bin:/home/dac9979/miniforge3/condabin:/n/cluster/bin:/opt/singularity/bin:/usr/local/bin:/usr/bin:/opt/puppetlabs/bin:/usr/local/rvm/bin:/usr/local/sbin:/usr/sbin:/home/dac9979/.local/bin:/home/dac9979/bin)
The path to the installation of the RMBLAST sequence alignment program.
RMBLAST_DIR [/home/dac9979/repeat_bin/rmblast-2.11.0/bin]: /home/dac9979/repeat_bin/rmblast-2.11.0/bin
Do you want RMBlast to be your default
search engine for Repeatmasker? (Y/N) [ Y ]: y
Add a Search Engine:
1. Crossmatch: [ Un-configured ]
2. RMBlast: [ Configured, Default ]
3. HMMER3.1 & DFAM: [ Un-configured ]
4. ABBlast: [ Un-configured ]
5. Done
Enter Selection: 5
Building FASTA version of RepeatMasker.lib .............................................
Building RMBlast frozen libraries..
The program is installed with a the following repeat libraries:
File: /home/dac9979/repeat_bin/RepeatMasker/Libraries/RepeatMaskerLib.h5
Database: Dfam withRBRM
Version: 3.6
Date: 2022-04-12
Dfam - A database of transposable element (TE) sequence alignments and HMMs.
RBRM - RepBase RepeatMasker Edition - version 20181026
Total consensus sequences: 63852
Total HMMs: 18987
Further documentation on the program may be found here:
/home/dac9979/repeat_bin/RepeatMasker/repeatmasker.help
Describe the issue
During installation, the software is unable to find RepeatMaskerLib.h5.
I have installed RepeatMasker many times in the past and have never encountered this problem. In this instance, I am trying to install an older version of RepeatMasker (v. 4.1.4) and prerequisite software to analyze some new data in the same way as an old analysis I previously completed. This installation is on an HPC where I have not used RepeatMasker before, so perhaps that is contributing to the problem.
Thanks in advance for help addressing this issue!
Reproduction steps
I also manually copied over Repbase (RepBaseRepeatMaskerEdition-20181026.tar.gz) after downloading it. Then I was able to unpack the Repbase release.
Not applicable.
Log output
Once the software and database were preconfigured, I tried installing RepeatMasker with perl ./configure. This took me through the installation wizard, in which I specified the paths to TRF and RMBlast. However, as you can see in the full output below, at some point the file RepeatMaskerLib.h5 is not being found. It appears that RepeatMasker is trying to build this file but fails, perhaps silently, at some stage, and then it can't find RepeatMaskerLib.h5 when it is needed. It's also strange that I am able to proceed further with the installation and the final output suggests that everything is installed correctly (although it lists no databases), but the error gives me pause and suggests that may not be the case. I have not tried running the software on a genome given this apparent issue.
Environment (please include as much of the following information as you can find out):
manual installation from repeatmasker.org
RepeatMasker -v can be used to find this. RepeatMasker v. 4.1.4
Yes, I installed Repbase (RepBaseRepeatMaskerEdition-20181026.tar.gz) - see above.
uname -a and lsb_release -a can be used to find this.
Additional context
This problem has only occurred for me on this HPC. I have installed RepeatMasker countless times in the past on a range of systems.
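Since the configure script can report a seemingly successful installation even when the library build did not complete, a quick check after installation may be worthwhile. The following is only a minimal sketch: the function name `check_rm_library` is hypothetical, and it only tests that RepeatMaskerLib.h5 exists and is non-empty in the Libraries directory. It does not validate the HDF5 contents, so a truncated file could still pass.

```shell
#!/usr/bin/env bash
# check_rm_library LIBDIR
# Hypothetical helper: succeeds only if RepeatMaskerLib.h5 exists in LIBDIR
# and is non-empty. It does not inspect the contents of the file.
check_rm_library() {
    local lib="$1/RepeatMaskerLib.h5"
    if [ -s "$lib" ]; then
        echo "OK: $lib ($(wc -c < "$lib") bytes)"
    else
        echo "MISSING or empty: $lib" >&2
        return 1
    fi
}
```

It would be called with the Libraries directory of the installation, for example `check_rm_library /home/dac9979/repeat_bin/RepeatMasker/Libraries` in my case.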