vlothec / TRASH

RepeatIdentifier
MIT License

R visualization #5

Closed duhuipeng closed 1 year ago

duhuipeng commented 1 year ago

I ran your first step using the .fa file under example_run. These are all the files it generated: (image)

Now I want to visualize the results in R, but I am not sure whether I passed the correct input file. The command I ran was:

Rscript TRASH_run.R ../example_run/all.repeats.from.CP068268_39050443_39150442.fa.csv

It produced the following error: (image)

I would like to draw the figure from your article, shown here: (image)

How do I do that? Could you provide the code used to draw these diagrams? Could you also provide the R command for running the visualization on the example_run output?

Best,
HuipengDu

vlothec commented 1 year ago

Hello, TRASH cannot produce these published images. They are the result of a much longer analysis of multiple genomes, in which TRASH repeat identification was only the first step.

By default, a circos plot is drawn to visualise repeat locations. Alternatively, after adding --simpleplot to the run command, for each sequence a plot will be drawn with repeat locations on the x axis and their sizes on the y axis.

Panel c is plotted with plot(x = repeats$start, y = repeats$HOR_score, type = "h", col = "blue", xlim = c(cen_start, cen_end)). Panel b is plotted similarly, using data for CEN178 repeats from 330 centromeres, averaged over their lengths in 100 bins. Label formatting for both was done in Photoshop. Please email me at pw457@cam.ac.uk for details, as this is not relevant to this GitHub repository.
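For anyone landing here later, the two panels described above can be sketched roughly as follows. This is not the code used for the published figure. It runs on synthetic data, and the data frame, the values of cen_start and cen_end, and the helper names plot_hor_scores and bin_means are all assumptions made for illustration. Only the plot() call itself comes from the reply above. Column names in an actual TRASH output table may differ.

```r
# Synthetic stand-in for a repeat table (hypothetical data, not TRASH output).
set.seed(1)
repeats <- data.frame(
  start     = sort(sample(1:100000, 200)),  # repeat start coordinates (bp)
  HOR_score = rpois(200, lambda = 5)        # per-repeat score
)
cen_start <- 20000  # assumed centromere boundaries
cen_end   <- 80000

# Panel c style: one vertical line per repeat, height = HOR score.
plot_hor_scores <- function(repeats, cen_start, cen_end) {
  plot(x = repeats$start, y = repeats$HOR_score, type = "h", col = "blue",
       xlim = c(cen_start, cen_end),
       xlab = "Position (bp)", ylab = "HOR score")
}

# Panel b style: mean of a per-repeat value in n_bins equal-width bins
# spanning one region. Empty bins are returned as NA.
bin_means <- function(pos, value, region_start, region_end, n_bins = 100) {
  inside <- pos >= region_start & pos <= region_end
  breaks <- seq(region_start, region_end, length.out = n_bins + 1)
  bin <- cut(pos[inside], breaks = breaks, include.lowest = TRUE,
             labels = FALSE)
  means <- tapply(value[inside], factor(bin, levels = seq_len(n_bins)), mean)
  as.numeric(means)
}

plot_hor_scores(repeats, cen_start, cen_end)

binned <- bin_means(repeats$start, repeats$HOR_score, cen_start, cen_end)
plot(seq_along(binned), binned, type = "l",
     xlab = "Bin (1 to 100 along the centromere)", ylab = "Mean HOR score")
```

Binning each centromere into the same number of bins puts centromeres of different lengths on a common axis. To combine many centromeres, one option is to call bin_means once per centromere, bind the resulting vectors as columns of a matrix, and take rowMeans(..., na.rm = TRUE).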