Open zyndagj opened 5 years ago
RepeatMasker does have an option to output GFF, but that output excludes all class and family information.
##gff-version 2
##date 2019-04-05
##sequence-region Arabidopsis_thaliana.TAIR10.dna.toplevel.fa
1  RepeatMasker  similarity  1     107   13.2  -  .  Target "Motif:ATREP18" 561 649
1  RepeatMasker  similarity  1066  1097  10.0  +  .  Target "Motif:(C)n" 1 32
1  RepeatMasker  similarity  1155  1187  17.1  +  .  Target "Motif:(TTTCTT)n" 1 33
To keep this information, I need to convert
   SW   perc perc perc  query     position in query           matching     repeat         position in repeat
score   div. del. ins.  sequence  begin  end      (left)      repeat       class/family   begin   end  (left)  ID

  282   13.2  0.0  8.1  1             1   107  (30427564)  C  ATREP18      DNA            (1142)  649    561    1
   22   10.0  0.0  0.0  1          1066  1097  (30426574)  +  (C)n         Simple_repeat       1   32    (0)    2
   15   17.1  0.0  0.0  1          1155  1187  (30426484)  +  (TTTCTT)n    Simple_repeat       1   33    (0)    3
  231   10.9  0.0  0.0  1          4285  4330  (30423341)  C  MuDR-16_ALy  DNA/MULE-MuDR  (3083)  934    889    4
to GFF3 format, where I track
in additional metadata fields
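For reference, here is a minimal Python sketch of the kind of conversion described above. It is not an existing tool, and several choices are my own assumptions: the function name `out_line_to_gff3`, the `repeat_region` feature type, putting the SW score in the score column, and the attribute names `repeat_class`, `repeat_family`, and `perc_div`. It assumes whitespace-separated `.out` rows laid out like the example above.

```python
from urllib.parse import quote


def out_line_to_gff3(line):
    """Convert one RepeatMasker .out data row to a GFF3 line.

    Returns None for header or blank lines.
    """
    fields = line.split()
    # Data rows have at least 15 columns and start with the integer SW score.
    if len(fields) < 15 or not fields[0].isdigit():
        return None
    score, div, _del, _ins, seqid, start, end, _left, strand, name, cls_fam = fields[:11]
    if strand == "C":
        # Complement hits list the repeat columns as "(left) end begin".
        strand = "-"
        r_start, r_end = fields[13], fields[12]
    else:
        # Forward hits list them as "begin end (left)".
        r_start, r_end = fields[11], fields[12]
    # "DNA/MULE-MuDR" -> class "DNA", family "MULE-MuDR"; "DNA" -> class only.
    rep_class, _, rep_family = cls_fam.partition("/")
    # Percent-encode characters that GFF3 reserves in column 9 (";", "=", ",", ...).
    attrs = ["Target=%s %s %s" % (quote(name, safe="()"), r_start, r_end),
             "repeat_class=" + quote(rep_class, safe="()")]
    if rep_family:
        attrs.append("repeat_family=" + quote(rep_family, safe="()"))
    attrs.append("perc_div=" + div)
    # Assumed choices: "repeat_region" as the feature type, SW score as the score.
    return "\t".join([seqid, "RepeatMasker", "repeat_region",
                      start, end, score, strand, ".", ";".join(attrs)])


def convert(lines):
    """Yield a GFF3 header followed by one GFF3 line per .out data row."""
    yield "##gff-version 3"
    for line in lines:
        gff = out_line_to_gff3(line)
        if gff is not None:
            yield gff
```

Usage would be something like `print("\n".join(convert(open(path))))`, where `path` points to a RepeatMasker `.out` file. Rows with a trailing `*` flag should still parse, since only the first 14 columns are read.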