Open Wangray123 opened 7 hours ago
The score in the fourth column of the rice_MH63_repeat.bed file represents the length of the TE annotated in the current bin. The fifth column of the rice_MH63_nonTEgene.gff3 file contains the end base position of gene annotations. You can learn more about the BED and GFF3 file formats through the following links:
https://github.com/jianshu93/gfftobed/blob/main/README.md
https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md

TE annotation files can be obtained using EDTA or RepeatMasker, and you can use bedtools for statistical analysis or convert formats using gfftobed (https://github.com/jianshu93/gfftobed). Gene annotations can be obtained through homology mapping or de novo annotation.

You can also learn more about the input file formats for GenomeSyn through the following link:
https://github.com/banzhou59/GenomeSyn/blob/main/GenomeSyn-1.2.7/README
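To make "length of the TE annotated in the current bin" concrete, here is a minimal Python sketch of one way such a per-bin value could be computed from TE intervals. It is only an illustration under stated assumptions, not the script used to build rice_MH63_repeat.bed: the function name `te_length_per_bin`, the 1000 bp bin size, the merging of overlapping TEs, and the example coordinates are all hypothetical. Intervals are treated as 0-based, half-open, as in BED.

```python
from collections import defaultdict


def te_length_per_bin(intervals, bin_size):
    """Sum TE-covered bases per fixed-size bin.

    intervals: iterable of (chrom, start, end), 0-based half-open (BED style).
    Returns rows of (chrom, bin_start, bin_end, te_bases), sorted by position.
    Assumption: overlapping TE annotations are merged first so that no base
    is counted twice; the example file may have been built differently.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    rows = []
    for chrom in sorted(by_chrom):
        # Merge overlapping or touching intervals on this chromosome.
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])

        # Split each merged interval at bin boundaries and add up the pieces.
        covered = defaultdict(int)
        for start, end in merged:
            pos = start
            while pos < end:
                bin_index = pos // bin_size
                piece_end = min(end, (bin_index + 1) * bin_size)
                covered[bin_index] += piece_end - pos
                pos = piece_end

        for bin_index in sorted(covered):
            bin_start = bin_index * bin_size
            # Note: the last bin is not clipped to the chromosome length here.
            rows.append((chrom, bin_start, bin_start + bin_size, covered[bin_index]))
    return rows


# Hypothetical TE intervals, not taken from the MH63 annotation.
example = [("Chr01", 100, 400), ("Chr01", 300, 1200), ("Chr02", 0, 50)]
rows = te_length_per_bin(example, bin_size=1000)
for row in rows:
    print("\t".join(str(field) for field in row))
```

Each printed line has four tab-separated columns (chromosome, bin start, bin end, TE bases in the bin), which matches the layout described above for the score column. In practice the input intervals would come from the EDTA or RepeatMasker output after conversion to BED.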
Hello,
I am trying to prepare a data file for visualization using this software, but I don't understand what the score values in the 4th column of your example file (rice_MH63_repeat.bed) mean, or what the values in the 5th column of the gene annotation file (rice_MH63_nonTEgene.gff3) mean.
Could you please explain them to me? Also, what kind of command should I input to obtain this type of file? Could you please provide me with the code to obtain these two files?
I look forward to your reply. Thank you!