Closed marco91sol closed 2 years ago
RepeatMasker has a fairly strict classification nomenclature, which makes it difficult to generate complete *.tbl files for custom libraries. For example, if your TEs are not named using the format "id#class/subclass", RepeatMasker will not recognize the standard classes. I would suggest running the util/buildSummary.pl script and ignoring everything but the per-family accounting it provides. You can then group the families however you like to get summary statistics.
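As a rough illustration of the "group them however you like" step, here is a minimal Python sketch. It does not parse buildSummary.pl output; instead it assumes the standard whitespace-separated RepeatMasker .out layout (query begin and end in columns 6 and 7, repeat class/family in column 11) and tallies hits and annotated bases per class label. The function names and the grouping rule in `group_label` (keep the first two underscore-separated tokens of a DeepTE-style label) are hypothetical and only for illustration. Overlapping annotations are counted more than once, so the totals are approximate.

```python
import sys
from collections import defaultdict


def summarize_out(lines):
    """Tally hit counts and annotated bases per class label from .out lines."""
    totals = defaultdict(lambda: {"count": 0, "bp": 0})
    for line in lines:
        fields = line.split()
        # Data rows have at least 15 columns and start with a numeric score;
        # this skips the header lines and blank lines.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])
        repeat_class = fields[10]
        totals[repeat_class]["count"] += 1
        totals[repeat_class]["bp"] += end - begin + 1
    return dict(totals)


def group_label(repeat_class):
    """Hypothetical grouping rule: drop leading underscores and keep the
    first two tokens, e.g. '__ClassI_nLTR_LINE_Jockey' -> 'ClassI_nLTR'."""
    parts = repeat_class.lstrip("_").split("_")
    return "_".join(parts[:2])


def group_totals(totals, label=group_label):
    """Merge per-class totals into coarser groups chosen by `label`."""
    grouped = defaultdict(lambda: {"count": 0, "bp": 0})
    for repeat_class, stats in totals.items():
        key = label(repeat_class)
        grouped[key]["count"] += stats["count"]
        grouped[key]["bp"] += stats["bp"]
    return dict(grouped)


if __name__ == "__main__":
    # Usage: python summarize_out.py myseq.fa.out
    with open(sys.argv[1]) as handle:
        per_class = summarize_out(handle)
    for name, stats in sorted(group_totals(per_class).items()):
        print(f"{name}\t{stats['count']}\t{stats['bp']}")
```

Any other rule can be swapped in by passing a different `label` function to `group_totals`, for example one that maps the custom labels onto RepeatMasker-style class/subclass names.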
Describe the issue
I ran RepeatMasker using a mixed library that combines de novo repeats with transposable elements known from other species (created with DeepTE).
The output table is attached. Is the problem caused by the header names in the library? The .tbl output seems incomplete, given that the .out file contains the following TE classes: ClassII_DNA_CACTA_MITE __ClassII_DNA_Harbinger_MITE ClassII_DNA_Harbinger_nMITE ClassII_DNA_Mutator_MITE __ClassII_DNA_Mutator_nMITE ClassII_DNA_PiggyBac_nMITE ClassIII_Helitron ClassI_LTR_BEL ClassI_nLTR ClassI_nLTR_LINE ClassI_nLTR_LINE_I __ClassI_nLTR_LINE_Jockey ClassI_nLTR_LINE_R2 __ClassI_nLTR_PLE ...and so on!
In my customized library, I have two different header types:
Euro_eel_AZBK01S000145.1_16703#__ClassI_nLTR_LINE_L1 #customized library
TE_00016804_INT#__ClassI_nLTR_LINE_Jockey #denovo library
Thanks for the support!
Best regards, Marco
Reproduction steps
Log output
Please paste or attach any and all log output, which contains useful information such as data file statistics and version numbers. An easy way to capture this is to redirect the log output to a file, e.g.:
RepeatMasker myseq.fa >& output.log
Environment (please include as much of the following information as you can find out):
How did you install RepeatMasker? e.g. manual installation from repeatmasker.org, bioconda, the Dfam TE Tools container, or as part of another bioinformatics tool?
Which version of RepeatMasker do you have? The output of
RepeatMasker -v
can be used to find this.
Have you installed RepBase RepeatMasker Edition, or the full Dfam database?
Operating system and version. The output of
uname -a
and
lsb_release -a
can be used to find this.
Additional context