Cause of the issue
We identified a bug in RepeatMasker 4.1.1 that affects classifications from RepeatClassifier that are based on similarity to other known elements. RepeatMasker's configure program generates a file named RepeatMasker.lib in its Libraries directory, which RepeatClassifier uses; however, in RepeatMasker 4.1.1 this file is missing the required classification data. This bug is fixed in RepeatMasker 4.1.2.
What programs are impacted by this bug?
This issue affects classifications from RepeatClassifier that are based on similarity to other known elements, causing them to be classified as "Unknown" instead. Classifications that are based on similarity to known protein sequences (RepeatPeps.lib) are unaffected by this bug. This bug only affects the RepeatClassifier program, which is part of the RepeatModeler package. It does not affect other programs, including RepeatMasker or RepeatModeler themselves.
Am I affected by this bug?
If you are using RepeatClassifier (or RepeatModeler, which runs this program as its last step) and have configured RepeatModeler to use RepeatMasker 4.1.1, you are probably affected. You can confirm whether you are affected by inspecting the beginning of RepeatMasker.lib by hand:
Example incorrect file (missing classification):
$ head -n1 /path/to/RepeatMasker/Libraries/RepeatMasker.lib
>ACRO1_ @Primates [S:50]
Example correct file (showing a classification #Satellite/acromeric):
$ head -n1 /path/to/RepeatMasker/Libraries/RepeatMasker.lib
>ACRO1_#Satellite/acromeric @Primates [S:50]
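The manual check above can also be scripted. The following is a minimal sketch, not part of RepeatMasker: the function name check_rm_lib is hypothetical, and it assumes only what the two examples show, namely that a correct header carries a "#" classification in its first whitespace-delimited field.

```shell
# Hypothetical helper (not shipped with RepeatMasker): report whether the
# first FASTA header of a RepeatMasker.lib file carries a "#Class" label.
check_rm_lib() {
    lib="$1"
    # First header line, i.e. the first line starting with ">".
    header=$(grep -m1 '^>' "$lib") || { echo "no FASTA header found"; return 2; }
    # The sequence ID is everything before the first whitespace character.
    id=${header%%[[:space:]]*}
    case "$id" in
        *'#'?*) echo "ok: $id"; return 0 ;;
        *)      echo "affected: $id has no classification"; return 1 ;;
    esac
}
```

Calling check_rm_lib /path/to/RepeatMasker/Libraries/RepeatMasker.lib returns a zero exit status when the first header includes a classification and a non-zero status otherwise, so it can be used in a larger script.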
Suggested solutions
1) You can install RepeatMasker 4.1.2 or later, in which this bug has been fixed.
2) You can install a copy of an older version of RepeatMasker (such as 4.1.0) and configure RepeatModeler to use that installation of RepeatMasker instead of RepeatMasker 4.1.1.
3) You can manually regenerate the file RepeatMasker.lib with the necessary classification data:
After applying any of these solutions and confirming that the file is fixed (see above), you can re-run only RepeatClassifier, without re-running all of RepeatModeler, to reclassify results:
This will reclassify sequences according to the new (fixed) RepeatMasker.lib file and overwrite the files yourgenome-families.fa.classified and yourgenome-families-classified.stk.
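Because the classified files are overwritten, it can be useful to record how many families are labelled "Unknown" before and after reclassifying. The sketch below is an illustration only: the function name count_unknown is hypothetical, and it assumes that headers in yourgenome-families.fa.classified follow the same ">name#Class" pattern shown for RepeatMasker.lib above.

```shell
# Hypothetical helper: count FASTA headers whose classification is "Unknown".
# Assumes headers look like ">name#Class/Subclass ..." (an assumption, by
# analogy with the RepeatMasker.lib headers shown earlier).
count_unknown() {
    awk '
        /^>/ {
            total++
            split($1, parts, "#")        # parts[2] is the classification, if present
            if (parts[2] == "Unknown") unknown++
        }
        END { printf "%d of %d families are Unknown\n", unknown, total }
    ' "$1"
}
```

Running count_unknown yourgenome-families.fa.classified once before and once after the re-run gives a quick indication of whether similarity-based classifications are now being assigned.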