Closed CSU-KangHu closed 3 months ago
I haven’t figured out which names RepeatMasker accepts and which it rejects. You should use the less picky buildSummary.pl script to generate the summary from the RepeatMasker .out file.
Shujun
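To illustrate why a summary built from the .out file is less sensitive to class names, here is a minimal Python sketch that totals annotated base pairs per class/family label exactly as the labels appear in the file. It is not buildSummary.pl and does not reproduce its output. It assumes the standard whitespace-separated RepeatMasker .out layout (query begin and end in columns 6 and 7, class/family in column 11), and the function name tally_bp_by_class is made up for this example. Overlapping annotations are counted more than once, so the totals are only a rough guide.

```python
from collections import defaultdict


def tally_bp_by_class(out_lines):
    """Sum annotated bp per class/family label from RepeatMasker .out lines."""
    totals = defaultdict(int)
    for line in out_lines:
        fields = line.split()
        # Skip the header lines and blanks: data rows start with a numeric score.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])
        # Column 11 is the class/family label, taken as-is with no whitelist.
        totals[fields[10]] += end - begin + 1
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage (file name is a placeholder): python tally.py genome.fa.out
    with open(sys.argv[1]) as handle:
        for label, bp in sorted(tally_bp_by_class(handle).items()):
            print(f"{label}\t{bp}")
```

Because the labels are used verbatim, a library entry labeled DNA/Helitron is reported under DNA/Helitron rather than being dropped or merged into another category.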
On Tue, Mar 5, 2024 at 9:07 PM Kang Hu @.***> wrote:
Hi @oushujun https://github.com/oushujun, Thank you for developing such a useful tool like EDTA.
While running EDTA, I encountered an issue that I had previously overlooked. The classification labels of the TE library outputted by EDTA do not match those of RepeatMasker. This discrepancy results in certain types of TEs being ignored when the TE library generated by EDTA is used as the input parameter (-lib) in RepeatMasker to generate the .tbl file.
For instance, EDTA's Helitron label is classified as DNA/Helitron, whereas RepeatMasker classifies it as RC/Helitron. As a result, the Rolling-circles line in the .tbl file shows 0 bp, and the proportion of DNA transposons is overestimated. Similarly, there are issues with other labels such as MITE, which seem to be classified as DNA labels for RepeatMasker to properly categorize them as DNA transposons. Could you please let me know if EDTA provides a script to convert the classification labels of the TE library to match those of RepeatMasker?
Hi @oushujun, thank you for developing such a useful tool as EDTA.
While running EDTA, I encountered an issue that I had previously overlooked. The classification labels in the TE library output by EDTA do not match those used by RepeatMasker. As a result, certain types of TEs are ignored when the EDTA-generated TE library is passed to RepeatMasker through the -lib parameter to generate the .tbl file.
For instance, EDTA labels Helitrons as DNA/Helitron, whereas RepeatMasker classifies them as RC/Helitron. Consequently, the Rolling-circles line in the .tbl file shows 0 bp, and the proportion of DNA transposons is overestimated. Similarly, there are issues with other labels such as MITE, which apparently need to be relabeled as DNA for RepeatMasker to properly categorize them as DNA transposons. Could you please let me know if EDTA provides a script to convert the classification labels of the TE library to match those of RepeatMasker?
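For anyone who wants to try the relabeling by hand, the conversion described above can be sketched in a few lines of Python. This is not a script shipped with EDTA, only an illustration under stated assumptions: library headers are assumed to follow the ">name#class/family" convention, the only full-label mapping taken from this thread is DNA/Helitron to RC/Helitron, and the MITE to DNA class rule is a guess based on the description above. The function names and the TE_0000000x identifiers are made up for the example.

```python
# Assumed mappings. Only the Helitron pair is stated in this thread;
# check every entry against your own library before relying on it.
FULL_LABEL_MAP = {"DNA/Helitron": "RC/Helitron"}
CLASS_MAP = {"MITE": "DNA"}  # guess: keep the family, swap the class


def relabel(label):
    """Map one class/family label to its assumed RepeatMasker equivalent."""
    if label in FULL_LABEL_MAP:
        return FULL_LABEL_MAP[label]
    cls, sep, family = label.partition("/")
    if cls in CLASS_MAP:
        return CLASS_MAP[cls] + sep + family
    return label  # unknown labels pass through unchanged


def relabel_header(header):
    """Rewrite the label in a '>name#class/family [description]' header."""
    if not header.startswith(">") or "#" not in header:
        return header
    name, _, rest = header.partition("#")
    label, space, desc = rest.partition(" ")
    return f"{name}#{relabel(label)}{space}{desc}"


def relabel_fasta(lines):
    """Yield FASTA lines with headers relabeled and sequence lines untouched."""
    for line in lines:
        line = line.rstrip("\n")
        yield relabel_header(line) if line.startswith(">") else line


if __name__ == "__main__":
    demo = [">TE_00000001#DNA/Helitron", "ACGT", ">TE_00000002#MITE/DTA", "TTGA"]
    for out in relabel_fasta(demo):
        print(out)
```

Keeping the mappings in two small dictionaries makes it easy to add or remove rules once it is clear which labels RepeatMasker actually accepts.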