Open RajvParvathaneni opened 2 years ago
Hi Rajiv,
I was wondering if you found a solution for this? I'm having a similar problem of running repeatmodeler2 on a cluster with a genome of 1.4Gb. Specifically, round 5 has an estimated duration of >200 hours, longer than my cluster will allow me to run jobs on. There appear to be a few other issues that have been raised with people having the same issue.
In #158, @jebrosen appeared to resolve the problem by setting export BLAST_USAGE_REPORT=false in the environment. This hasn't resolved the problem for me, unfortunately.
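For anyone trying the same workaround, here is a minimal sketch of how the variable might be set. Only the variable name and value come from #158. Placing the export inside the job script is an assumption on my part, since some clusters do not pass the submitting shell's environment to the job.

```shell
#!/bin/sh
# Set the variable mentioned in #158 for this shell and its child processes.
export BLAST_USAGE_REPORT=false

# Confirm that a child process actually sees the value.
sh -c 'echo "BLAST_USAGE_REPORT=$BLAST_USAGE_REPORT"'

# RepeatModeler would then be started from this same script, for example:
# RepeatModeler -database mydb > output.log 2>&1
```

If the echoed value is not false inside the running job, the export did not reach the process, and the workaround was never actually tested.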
Any advice would be greatly appreciated.
Kind regards,
Christophe
Were either of you able to fix this? I am also running into this issue running on a cluster with a genome size of 1.1 Gb. The estimated time for round 6 is around 400 hours. Would also appreciate any advice or insight.
Thank you,
Cinnamon
For me, using RepeatModeler installed with conda significantly reduces run time compared to a manually installed RepeatModeler. Both were run on the same university system, on the same genome FASTA file, and with the same random seed. I'm not sure what the reason for this is, but I would love to know.
Describe the issue
I am running RepeatModeler on the university cluster with my fungal genome (~46 MB), using the commands below. It has completed 2 rounds and is still running. How long should it take? Or do I need to modify my commands to make it run faster? Any guidance is appreciated.
BuildDatabase -name E2 -engine ncbi xxx.fa
RepeatModeler -engine ncbi -pa 3 -database E2 > E2-repeat.out
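The two steps above could be wrapped in a small script that also captures error output and prints the random seed line. This is only a sketch: the file and database names are the ones from the post, the guard that skips the run when the tools or the FASTA file are missing is my own addition, and the case-insensitive search for "seed" assumes nothing about the exact wording of that log line.

```shell
#!/bin/sh
# Names taken from the commands in the post.
GENOME=xxx.fa
DB=E2
LOG=E2-repeat.out

if command -v BuildDatabase >/dev/null 2>&1 \
   && command -v RepeatModeler >/dev/null 2>&1 \
   && [ -f "$GENOME" ]; then
    # Step 1: build the database; step 2 runs only if step 1 succeeds.
    BuildDatabase -name "$DB" -engine ncbi "$GENOME" \
        && RepeatModeler -engine ncbi -pa 3 -database "$DB" > "$LOG" 2>&1
    # Show any log line mentioning the seed, needed to reproduce the run.
    grep -i "seed" "$LOG" || echo "No seed line found in $LOG"
else
    echo "BuildDatabase/RepeatModeler or $GENOME not found; nothing was run." >&2
fi
```

Redirecting with 2>&1 puts errors in the same file as the normal output, so the log attached to the issue is complete.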
A concise description of the bug, including any error messages.
Reproduction steps
Log output
Please paste or attach any and all log output, which includes useful information including data file statistics and version numbers. An easy way to capture this is to redirect the log output to a file, e.g.
RepeatModeler -database mydb >& output.log
The log output should include the "random seed" value at the start of the run. This number will be necessary in order to reproduce the run exactly.
Environment (please include as much of the following information as you can find out):
How did you install RepeatModeler? e.g. manual installation from repeatmasker.org, bioconda, the Dfam TE Tools container, or as part of another bioinformatics tool?
Which version of RepeatModeler do you have? The output of RepeatModeler without any options will be a help page with the version of the program displayed at the top.
Which version of RepeatMasker is this RepeatModeler installation using? Have you installed RepBase RepeatMasker Edition for RepeatMasker, or the full Dfam database?
Operating system and version. The output of uname -a and lsb_release -a can be used to find this.
Additional context
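The environment details requested above could be gathered in one pass with something like the following. This is a sketch: the report file name is hypothetical, and the checks for missing commands are my own addition so that it still runs on systems without lsb_release or RepeatModeler.

```shell
#!/bin/sh
# Hypothetical output file name; attach it to the issue afterwards.
REPORT=environment-report.txt

{
    echo "== uname -a =="
    uname -a

    echo "== lsb_release -a =="
    if command -v lsb_release >/dev/null 2>&1; then
        lsb_release -a 2>/dev/null
    else
        echo "lsb_release not available"
    fi

    echo "== RepeatModeler help page (version at the top) =="
    if command -v RepeatModeler >/dev/null 2>&1; then
        # Run without options to print the help page; keep only the first lines.
        RepeatModeler 2>&1 | head -n 5
    else
        echo "RepeatModeler not found on PATH"
    fi
} > "$REPORT"

echo "Wrote $REPORT"
```

Running it on the compute node rather than the login node is probably more informative, since the two can differ.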