Closed: Nilad closed this issue 3 years ago
Looks like you are using a non-standard installation of RepeatMasker ( image/container/wrapper ). It's really difficult for us to manage other developers' installations/modifications to our software. I usually recommend directing these kinds of requests directly to the individual(s) who packaged and distributed the software you are running.
The error indicates that the installation of RepeatMasker is not complete. The configure script was never run, and therefore the first invocation of RepeatMasker itself is trying to set up the RepeatMasker/Libraries directory. In this case you don't have write privileges to that directory. Once this is configured ( usually by an administrator if it's a system-wide installation ), this directory does not need to be written to anymore. RepeatMasker does create cached library files for each species at runtime; however, once configured, if it can't write to the RepeatMasker/Libraries directory it will save its cached files to the user's ~/.RepeatMaskerCache directory. I would contact the author of the singularity package and request that they update the image by running configure before creating it.
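The fallback described above can be checked by hand before launching a run. The following is a minimal sketch, not RepeatMasker's own logic: the function name rm_cache_dir and its argument are hypothetical, and it only mirrors the rule stated in this comment (use the Libraries directory if it is writable, otherwise ~/.RepeatMaskerCache).

```shell
# Sketch only: mirrors the fallback rule described in this comment.
# It is NOT taken from the RepeatMasker source code.
# rm_cache_dir is a hypothetical helper name.
# Argument: path to the RepeatMasker/Libraries directory.
rm_cache_dir() {
    rm_libdir=$1
    if [ -d "$rm_libdir" ] && [ -w "$rm_libdir" ]; then
        # Libraries directory exists and is writable by the current user.
        printf '%s\n' "$rm_libdir"
    else
        # Read-only or missing: per-user cache directory is expected instead.
        printf '%s\n' "$HOME/.RepeatMaskerCache"
    fi
}
```

Inside a container this could be called as rm_cache_dir /usr/local/share/RepeatMasker/Libraries (the path used in the Biocontainers example later in this thread). Note that this only predicts where cache files would go on a configured installation; it does not replace running configure.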
A current downside to creating images using RepeatMasker is that for the time being the largest database of TEs is a closed ( license restricted ) one - RepBase. If you package up RepeatMasker as you must, without this database included, you will be requiring the user to finish the installation at a later time. They will have to download the library from GIRI and re-run the configure script. These actions modify the installation directory and will not work with static (read-only) installations.
I built the Singularity image myself, based on this Docker Hub page (which is mainly reused by other users): https://hub.docker.com/r/robsyme/repeatmasker-onbuild
Configuring with perl ./configure is problematic because it requires user interaction. If this step cannot be bypassed, how can I run the configuration without interaction?
I just want to use RepeatMasker with RepeatScout on my own data, without any other reference libraries.
In the new version of RepeatMasker (4.0.9), the configure script now supports command-line parameters for all options. It still needs to be run in order to set up the Libraries/ directory, even if you only plan to use custom libraries. That might work for you. We are starting to evaluate containerization/packaging technologies to add support for this type of installation method.
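As an illustration of what a non-interactive configure call might look like in an image build, here is a hedged sketch. The flag names (-trf_prgm, -rmblast_dir, -default_search_engine) are an assumption based on later 4.1.x releases and have not been verified against 4.0.9, so check the configure script shipped with your version. The helper name rm_configure_cmd and both paths are hypothetical. The function only prints the command line so it can be reviewed before running it.

```shell
# Sketch: build (but do not run) a non-interactive configure command line.
# ASSUMPTION: the flag names below are those of RepeatMasker 4.1.x releases;
# they are not confirmed for 4.0.9. Verify against your configure script.
# rm_configure_cmd is a hypothetical helper name.
rm_configure_cmd() {
    rm_trf_path=$1       # path to the trf binary (hypothetical location)
    rm_rmblast_bin=$2    # directory containing rmblastn (hypothetical location)
    printf 'perl ./configure -trf_prgm %s -rmblast_dir %s -default_search_engine rmblast\n' \
        "$rm_trf_path" "$rm_rmblast_bin"
}
```

For example, rm_configure_cmd /opt/trf /opt/rmblast/bin prints a command that could then be pasted into a Dockerfile or Singularity recipe, so that configure runs at build time while the installation directory is still writable.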
As for RepeatScout, just a word of caution. RepeatScout is tuned in such a way that it excels at finding young (less diverged) repeat families. I would recommend using it in tandem with RECON ( as we do in RepeatModeler ) to round out the range of families identified. Also, we have a new version of RepeatScout in development which can process genome-size samples, supports an affine gap model and custom scoring matrices for improved sensitivity. We hope to get that out this year.
Using the Biocontainers RepeatMasker image, one can create a subdirectory on the host at some path that is bind mounted in the container (e.g., the current working directory) containing symbolic links to the files in the RepeatMasker/Libraries directory (in the container), and set the (Bioconda/Biocontainers-RepeatMasker-specific) REPEATMASKER_LIB_DIR environment variable to this directory.
e.g.:
$ mkdir repeatmasker-libraries
$ singularity pull repeatmasker:4.0.9_p2--pl526_0.sif docker://quay.io/biocontainers/repeatmasker:4.0.9_p2--pl526_0
...
$ singularity exec repeatmasker:4.0.9_p2--pl526_0.sif sh -c 'ln -s /usr/local/share/RepeatMasker/Libraries/* repeatmasker-libraries/'
$ ls -l repeatmasker-libraries/
total 0
lrwxrwxrwx 1 user group 54 May 10 11:51 Artefacts.embl -> /usr/local/share/RepeatMasker/Libraries/Artefacts.embl
lrwxrwxrwx 1 user group 49 May 10 11:51 Dfam.embl -> /usr/local/share/RepeatMasker/Libraries/Dfam.embl
lrwxrwxrwx 1 user group 48 May 10 11:51 Dfam.hmm -> /usr/local/share/RepeatMasker/Libraries/Dfam.hmm
...
$ REPEATMASKER_LIB_DIR=$PWD/repeatmasker-libraries singularity exec repeatmasker:4.0.9_p2--pl526_0.sif RepeatMasker -species human hsap_contig.fasta
RepeatMasker version open-4.0.9
Search Engine: NCBI/RMBLAST [ 2.6.0+ ]
Rebuilding RepeatMaskerLib.embl master library
- Read in 9 sequences from /scratch/nweeks/maker/data/repeatmasker-libraries/Artefacts.embl
- Read in 6235 sequences from /scratch/nweeks/maker/data/repeatmasker-libraries/Dfam.embl
RepeatMaskerLib.embl: 6244 total sequences.
Building FASTA version...Master RepeatMasker Database: /scratch/nweeks/maker/data/repeatmasker-libraries/RepeatMaskerLib.embl ( Complete Database: CONS-Dfam_3.0 )
...
Generating output...
masking
done
$ ls -l repeatmasker-libraries/RepeatMaskerLib.embl
-rw-r--r-- 1 user group 20552410 May 10 11:54 repeatmasker-libraries/RepeatMaskerLib.embl
It looks like this issue has been fixed in a more recent version of RepeatMasker and can be closed: the configure program now accepts parameters at the command line. If you encounter problems with these options, please file a new issue.
Hi! I am using RepeatMasker 4.1.1. While running the tool, I came across an error:
Command: ./RepeatMasker -species fungi /home/guest1/assembly.fasta -dir /home/guest1/maker/RM_output
Output: The assumed RepeatMasker installation directory /home/guest1/maker/RepeatMasker does not appear to be correct. E.g it does not contain a 'Libraries' or 'Matrices' subdirectory. This can occur if hard links are used to invoke this script.
Kindly help. I shall be highly obliged.
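The message above names two subdirectories that RepeatMasker expects next to its main script. A quick way to see which one is missing is sketched below. The function name check_rm_install is hypothetical, and treating either missing directory as a failure is an assumption drawn from the wording of the error, not from the RepeatMasker source.

```shell
# Sketch: report which expected subdirectories are absent from an
# assumed RepeatMasker installation directory.
# check_rm_install is a hypothetical helper name.
# Returns 0 if both are present, 1 otherwise.
check_rm_install() {
    rm_dir=$1
    rm_status=0
    for rm_sub in Libraries Matrices; do
        if [ ! -d "$rm_dir/$rm_sub" ]; then
            printf 'missing: %s/%s\n' "$rm_dir" "$rm_sub"
            rm_status=1
        fi
    done
    return "$rm_status"
}
```

With the paths from the report, this would be called as check_rm_install /home/guest1/maker/RepeatMasker. If the script was copied or hard-linked out of its original directory, running this against the directory that actually contains Libraries and Matrices should print nothing.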
Hi,
First, thanks for maintaining this tool.
I am trying to run RepeatMasker from a Singularity image.
I used RepeatScout before RepeatMasker (without issue :+1: )
But I have this problem:
Issue
Command line
singularity exec RepeatMasker.simg RepeatMasker -lib repeatscout.filtered myFasta.fasta