Open mictadlo opened 5 years ago
I'm getting the same error - does this have to do with RepBase making their software proprietary?
Hi, the error comes from MAKER trying to use the RepBase version bundled with RepeatMasker. However, this RepBase version is not packaged in bioconda (see https://github.com/bioconda/bioconda-recipes/blob/master/recipes/repeatmasker/build.sh#L16): the one in the default RepeatMasker distribution was too old, and we couldn't ship a newer one due to the RepBase license.
The solution is to download RepBase manually, and set the REPEATMASKER_LIB_DIR and REPEATMASKER_MATRICES_DIR environment variables.
in what file do we set the variables?
@xonq: you can set the environment variables in your shell (or script) before invoking maker, e.g.:
export REPEATMASKER_LIB_DIR=/path/to/my/repeatmasker/lib
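A slightly fuller sketch of the same idea, setting both variables at once and checking the paths first. The helper name set_repeatmasker_dirs is hypothetical, and the Libraries/ and Matrices/ subdirectory layout is an assumption based on the paths mentioned elsewhere in this thread. It rejects relative paths because those would be resolved against whatever directory maker is started from.

```shell
# Sketch: export both RepeatMasker directories, refusing relative or missing paths.
# set_repeatmasker_dirs is a hypothetical helper, not part of MAKER or RepeatMasker.
set_repeatmasker_dirs() {
    base=$1   # RepeatMasker directory, assumed to contain Libraries/ and Matrices/
    case $base in
        /*) ;;   # absolute path: fine
        *) echo "not an absolute path: $base" >&2; return 1 ;;
    esac
    for sub in Libraries Matrices; do
        if [ ! -d "$base/$sub" ]; then
            echo "missing directory: $base/$sub" >&2
            return 1
        fi
    done
    export REPEATMASKER_LIB_DIR="$base/Libraries"
    export REPEATMASKER_MATRICES_DIR="$base/Matrices"
}

# Usage (placeholder path):
# set_repeatmasker_dirs /path/to/my/repeatmasker && maker
```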
can this be resolved by installing a different repeatmasker version? i.e. conda install maker repeatmasker=4.0.7
edit: this does not work
Hi, Does anyone know where I can download or retrieve the MATRICES? Thanks!
You have to get a license for the program and install it.
Hi @abretaud, I tried to follow the solutions you provided but I still encounter a similar issue.
Here is what I have tried:
I found that the Maker bioconda package seems to already have the libraries, so I downloaded the updated library "RepBaseRepeatMaskerEdition-20181026.tar.gz" from RepBase and uncompressed it in my Bioconda environment directory: /sd/MAKER_py2/share/RepeatMasker/Libraries
I set the environment variable for the library: export REPEATMASKER_LIB_DIR=sd/MAKER_py2/share/RepeatMasker/Libraries
I found that the Maker bioconda package also seems to already have the matrices, so I set the environment variable for the matrices as: export REPEATMASKER_MATRICES_DIR=sd/MAKER_py2/share/RepeatMasker/Matrices
I echoed both variables and they look correct.
However, when I tried Maker, it shows:
maker -h
Possible precedence issue with control flow operator at /sd/MAKER_py2/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
MAKER version 2.31.9
Could you help me see which step might have gone wrong? Thank you so much.
I'm having similar issues. This post by Maker's author suggests that the conda version of Maker might not be properly working. http://gmod.827538.n3.nabble.com/Does-Conda-Maker-actually-work-td4060214.html#a4060215
@gbdias : that post was a while ago; the inline C issue should have been resolved: https://github.com/bioconda/bioconda-recipes/pull/15001
@phhsieh1329: the warning is harmless, and is fixed in newer versions of bioperl: https://github.com/bioperl/bioperl-live/pull/251 (MAKER is pinned to bioperl 1.7.2, since bioperl 1.7.3 removed many modules, which were separated into different distributions)
@nathanweeks Thanks for the information. In this case, do you happen to know whether I should download the latest release from RepBase and update the local repeat database with it? Thanks.
@phhsieh1329 : I guess it depends on whether or not you need RepBase (and have the $$ to pay for a version). RepeatMasker is bundled with Dfam.
Hi, I avoided this error by running "$ RepeatMasker ./configure" from within the conda environment in which maker was installed.
Please excuse my lack of knowledge. I'm a total newb. I'm about to run MAKER on a de novo assembly. My institution doesn't have a REPBASE license. Does MAKER call on the REPBASE website, or does the MAKER install include a REPBASE database that it uses to mask repeats? I want to know if I'm doomed to fail without a license before I spend the time and money to engage the supercomputer that will do the processing.
Thanks, and please ask followup questions. I'll muddle my way through them.
MAKER2 requires that you have a license for and install RepBase. It does not install RepBase for you.
BTW, at the risk of false positives, the NCBI Eukaryotic Genome Pipeline uses WindowMasker (installed alongside BLAST) as an alternative to RepeatMasker: https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/
Ok, thanks. Now, can I have MAKER call WindowMasker, or do I run WindowMasker on the genome first and then feed the result into MAKER? If the former, how do I have MAKER do that?
According to this reply, MAKER checks xxx/RepeatMasker/Libraries/RepeatMaskerLib.embl for Repbase.
However, RepeatMasker no longer creates RepeatMaskerLib.embl; instead it uses Dfam.h5 to create RepeatMaskerLib.h5. Even if you download an old version of Repbase somewhere (e.g. RMRBSeqs.embl), RM will only add it into RepeatMaskerLib.h5, so setting REPEATMASKER_LIB_DIR won't help.
BTW, RM uses LIBDIR instead of REPEATMASKER_LIB_DIR in newer versions.
So there are two ways to solve this:
- Just set model_org= empty in maker_opts.ctl, and it won't check if Repbase was installed.
- If you have older versions of Repbase like RMRBSeqs.embl, create a symlink in xxx/RepeatMasker/Libraries/. For example, ln -s RMRBSeqs.embl RepeatMaskerLib.embl. But in that case, model_org= should be set to an organism that exists in the database, instead of all. CAUTION: I don't know whether the result is reliable enough this way.
Last but not least, since Repbase now provides repeat_db in fasta format, if you have a newer version of the db, just provide it by setting rmlib=xxx.fa in the MAKER config.
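The symlink option above could be sketched roughly as follows. The helper name link_repbase_lib is hypothetical, and the Libraries directory is passed in as an argument because its real location depends on the install. This only automates the ln -s step; it says nothing about whether the resulting masking is reliable.

```shell
# Sketch of the symlink workaround; link_repbase_lib is a hypothetical helper.
link_repbase_lib() {
    libdir=$1   # e.g. xxx/RepeatMasker/Libraries
    if [ ! -f "$libdir/RMRBSeqs.embl" ]; then
        echo "no RMRBSeqs.embl in $libdir" >&2
        return 1
    fi
    # Do not overwrite an existing file or link (including a dangling one).
    if [ -e "$libdir/RepeatMaskerLib.embl" ] || [ -L "$libdir/RepeatMaskerLib.embl" ]; then
        echo "RepeatMaskerLib.embl already exists, leaving it alone" >&2
        return 0
    fi
    # Relative link target, so the link survives moving the directory.
    (cd "$libdir" && ln -s RMRBSeqs.embl RepeatMaskerLib.embl)
}

# Usage (placeholder path):
# link_repbase_lib /path/to/RepeatMasker/Libraries
```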
I also ran into this problem, and it can be resolved. First, run which -a RepeatMasker; if it reports ~/anaconda3/bin/RepeatMasker, this may be the source of the problem. You'd better install RepeatMasker manually. You also need to download the Repbase database and decompress it in the RepeatMasker working directory in order to update the library files. Finally, change the RepeatMasker path in maker_exe.ctl. Then maker will work correctly.
Replacing the 2018 RepeatMasker with the 2017 version, along with the matching RepBase, solved the problem. By the way, it seems that maker3 is more accurate than maker2.
This error comes from line 4363 of GI.pm in the MAKER library, where MAKER tries to derive the library path from the absolute path of the RepeatMasker executable.
Here is the code in GI.pm
my $exe = Cwd::abs_path($CTL_OPT{RepeatMasker});
my ($lib) = $exe =~ /(.*\/)RepeatMasker$/;
$lib .= "Libraries/RepeatMaskerLib.embl";
MAKER breaks down and prints the error if $lib is empty. I found that RepeatMaskerLib.embl is missing from my RepeatMasker library directory. So, as @Yorks0n said, you can set model_org to empty or create a symlink, and I think it should also work if you change the source code to let MAKER find RMRBSeqs.embl instead of RepeatMaskerLib.embl:
$lib .= "Libraries/RMRBSeqs.embl";
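To see which file this logic would look for on a given machine without touching the Perl source, the same path derivation can be mirrored in shell. This is a rough sketch: repbase_lib_path is a hypothetical helper, and it assumes a readlink that supports -f (to resolve symlinks the way Cwd::abs_path does).

```shell
# Sketch mirroring the GI.pm lines above in shell; repbase_lib_path is hypothetical.
repbase_lib_path() {
    # Resolve symlinks, as Cwd::abs_path does (readlink -f is assumed available).
    exe=$(readlink -f "$1") || return 1
    case $exe in
        */RepeatMasker) ;;
        *) return 1 ;;   # the Perl regex would not match in this case
    esac
    # Strip the trailing "RepeatMasker" and append the library file name.
    printf '%s\n' "${exe%RepeatMasker}Libraries/RepeatMaskerLib.embl"
}

# Usage: report whether the file MAKER looks for is present.
# lib=$(repbase_lib_path "$(command -v RepeatMasker)") && ls -l "$lib"
```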
According to this reply, MAKER checks xxx/RepeatMasker/Libraries/RepeatMaskerLib.embl for Repbase.
However, RepeatMasker no longer creates RepeatMaskerLib.embl; instead it uses Dfam.h5 to create RepeatMaskerLib.h5. Even if you download an old version of Repbase somewhere (e.g. RMRBSeqs.embl), RM will only add it into RepeatMaskerLib.h5, so setting REPEATMASKER_LIB_DIR won't help. BTW, RM uses LIBDIR instead of REPEATMASKER_LIB_DIR in newer versions.
So there are two ways to solve this:
- Just set model_org= empty in maker_opts.ctl, and it won't check if Repbase was installed.
- If you have older versions of Repbase like RMRBSeqs.embl, create a symlink in xxx/RepeatMasker/Libraries/. For example, ln -s RMRBSeqs.embl RepeatMaskerLib.embl. But in that case, model_org= should be set to an organism that exists in the database, instead of all. CAUTION: I don't know whether the result is reliable enough this way.
Last but not least, since Repbase now provides repeat_db in fasta format, if you have a newer version of the db, just provide it by setting rmlib=xxx.fa in the MAKER config.
This worked for me
Hi, I'm fairly new to bioinformatics and am currently trying to use MAKER to annotate my assembly.
I've installed MAKER v3.01.03 using bioconda, and so far everything runs smoothly following this tutorial with model_org= set to empty.
Please correct me if I'm wrong, but setting model_org= to empty would mean that the entire repeat masking step is skipped, yes? That is what is written in the maker_opts.ctl file, but I would like to run repeat masking.
I don't have an older version of RepBase, so the symlink method doesn't seem to apply to me. I don't have a subscription for it either.
I've seen in #26529 that Dfam can be used instead, but I haven't been able to find a way to instruct MAKER to use Dfam.
Similar to https://github.com/bioconda/bioconda-recipes/issues/16501#issuecomment-1308307968, I do not have RepeatMaskerLib.embl in my Libraries folder either.
As I understand it, manually tweaking the source code as described in https://github.com/bioconda/bioconda-recipes/issues/25559#issuecomment-738756514 seems to be required for MAKER to recognize RepeatMaskerLib.h5, which is created when RepeatMasker ./configure is run?
Is there any other workaround to resolve this error?
Hi! I think Dfam is the way to go now rather than the non-free RepBase. You can also try running RepeatModeler to create a repeat library specific to the genome you want to annotate (then give that library to RepeatMasker). Not sure if the code tweaking still works, but it seems like a good option. You can also run RepeatMasker on your own and give the output to maker as a pre-masked genome sequence.
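For the custom-library route, one way to wire such a library into MAKER is to leave model_org= empty and set rmlib= in maker_opts.ctl, as suggested earlier in the thread. The sketch below edits the control file in place. It assumes the file uses key=value #comment lines with values that contain no spaces; set_ctl_opt is a hypothetical helper and the library path is a placeholder.

```shell
# Sketch: point MAKER at a custom repeat library by editing maker_opts.ctl.
# set_ctl_opt is a hypothetical helper; values must not contain '|' or '&'.
set_ctl_opt() {
    file=$1; key=$2; value=$3
    # Replace only the value right after "key=", keeping any trailing "#comment".
    # A backup of the original file is written to "$file.bak".
    sed -i.bak -E "s|^(${key}=)[^#[:space:]]*|\\1${value}|" "$file"
}

# Usage (placeholder library path, e.g. a RepeatModeler output file):
# set_ctl_opt maker_opts.ctl model_org ""
# set_ctl_opt maker_opts.ctl rmlib /path/to/custom_repeats.fa
```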
Hi, not sure if this helps.
Just wanted to update that I've managed to work around ERROR: Could not determine if RepBase is installed by installing h5py with conda in the same environment as MAKER (I think the Python version in the env has to be >3.8), then running export LIBDIR=/path/to/conda/environment/share/RepeatMasker/Libraries, then querying famdb.py with /path/to/the/maker/environment/share/RepeatMasker/famdb.py lineage -d all > ATextFile.txt, and finally specifying what I need for model_org= in the maker_opts.ctl file.
So far, trying it out with the example dataset from MAKER seems to work with model_org=Alca. I haven't tried it on my own files though.
Hi @abretaud, @nathanweeks, @johanneskoester, @kastman, @pvanheus, @jerowe, @bgruening and @ArneKr,
I ran Maker but I got the following error:
Where do I install RepBase with this package?
Thank you in advance,
Michal