Open EricDeveaud opened 2 years ago
This is indeed a problem. DateRepeats is quite an old tool and may need some modifications in order to make it work with the new *.h5 database format. I will let you know if I can find a quick workaround.
DateRepeats 4.1.2 is also failing at the UCSC Genome Browser while building our hg38 patch 14. We use it to strip out the human-specific repeats.
I added the famdbfile setting to DateRepeats so that it no longer complains about the famdbfile path not being found:
my $tax = Taxonomy->new( taxonomyDataFile => $taxFile, famdbfile => "$dir/RepeatMaskerLib.h5" );
However, it then ran for more than 27 hours, using CPU the whole time, until I killed it.
With RM version 4.1.0, all the small patch chromosomes finished in just about one minute each.
Please let me know if it would be handy to supply the commandline and input file for testing.
Hanging command is: DateRepeats chr5_MU273352v1_fix.txt -query human -comp 'mus musculus'
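For anyone trying to reproduce the hang without tying up a CPU for a day, a wall-clock limit around the command may help. The following is only a sketch: it assumes GNU coreutils `timeout` is available, the helper name `run_with_limit` is hypothetical, and the 10-minute limit is an arbitrary choice rather than anything taken from the report.

```shell
#!/usr/bin/env bash
# Run a command under a wall-clock limit and report if the limit was hit.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# command is killed for exceeding the limit.
run_with_limit() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
    local status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after $limit: $*" >&2
    fi
    return "$status"
}

# Example invocation (command copied from the comment above; 10m is arbitrary):
# run_with_limit 10m DateRepeats chr5_MU273352v1_fix.txt -query human -comp 'mus musculus'
```

An exit status of 124 indicates the limit was reached; any other status is whatever the wrapped command itself returned.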
Thanks Galt. I removed DateRepeats in the latest version (4.1.4) as it needs refactoring. I will make sure this is a high priority for the next release.
Describe the issue
RepeatMaskerLib.embl is not built while configuring RepeatMasker-4.1.2-p1, but it is required by DateRepeats.
Reproduction steps
Log output
BUT !
and
no RepeatMasker.embl required by DateRepeats
Environment (please include as much of the following information as you can find out):
How did you install RepeatMasker? manual installation from repeatmasker.org from tar.gz archive
Which version of RepeatMasker do you have?
Operating system and version. The output of uname -a and lsb_release -a can be used to find this.
Additional context
Version 4.1.0, previously installed, works as expected.