Across multiple runs of RepeatMasker 4.1.5 on my genome, I noticed that masking with the RepBase libraries reports Penelopes at 0.00%, whereas masking with my own classified de novo library reports them at 1%. The result also depends on whether I label them "#PLE" or "#LINE/Penelope" (the label used in the RepBase database): with "#LINE/Penelope", 0% Penelopes are reported and the LINE percentage increases. I inferred that this classification causes Penelopes to be counted under "LINEs" rather than "Penelopes".
My question is whether my conclusion is right, and whether I need to reclassify the RepBase library (installed for RepeatMasker) so that all elements classified "#LINE/Penelope" are changed to "#PLE", to ensure correct annotation in the .tbl file.
The .tbl file is just one way to tabulate the results and, as you found, it is quite opinionated about how it groups them. I would suggest using the utility "RepeatMasker/util/buildSummary.pl" to process your .out file. This gives per-family statistics that you can tabulate as you like.
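If you would rather tabulate the .out file yourself, the idea can be sketched in a few lines of Python. This is only an illustration, not what buildSummary.pl does internally: it assumes the usual whitespace-separated .out layout (query begin and end in columns 6 and 7, class/family in column 11), it sums raw annotation lengths so overlapping annotations are counted twice, and the helper names `tally_out_lines` and `merge_labels`, as well as the sample rows, are made up for the example.

```python
from collections import defaultdict


def tally_out_lines(lines):
    """Sum annotated bases per class/family label from RepeatMasker .out lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Data rows start with an integer score; header and blank lines do not.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])  # position in query
        totals[fields[10]] += end - begin + 1        # class/family column
    return dict(totals)


def merge_labels(totals, aliases):
    """Fold alternative labels together, e.g. LINE/Penelope into PLE."""
    merged = defaultdict(int)
    for label, bases in totals.items():
        merged[aliases.get(label, label)] += bases
    return dict(merged)


# Hypothetical rows in .out layout, for illustration only.
sample = [
    "   SW   perc perc perc  query  position in query  matching  repeat",
    "score   div. del. ins.  sequence  begin end (left)  repeat  class/family",
    "",
    " 1320 15.6 6.2 0.0 chr1 1001 1500 (8500) C Penelope-1 LINE/Penelope (0) 500 1 1",
    "  980 20.1 1.0 2.0 chr1 3000 3299 (6701) + PLE-2 PLE 1 300 (0) 2",
    "  450 10.0 0.0 0.0 chr1 5000 5099 (4901) + L2 LINE/L2 10 109 (200) 3",
]

per_label = tally_out_lines(sample)
combined = merge_labels(per_label, {"LINE/Penelope": "PLE"})
for label, bases in sorted(combined.items()):
    print(label, bases)
```

For a real run you would pass the lines of your own .out file instead of `sample`. Grouping labels after the fact this way means the library itself does not have to be edited just to change how the totals are reported.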