Closed josiahseaman closed 8 years ago
I'm still experimenting with what is easiest to read. I tweaked the rules so that a set of columns still reads left to right across the screen and only wraps to a new row (a major disjunction) after 100 columns.
Here's a sample of the new layout I'm experimenting with. Each row is 10 megabases of Bonobo Chromosome 22.
Same as above but with a column width of 171bp. Notice features still land in the same place on the screen despite different widths.
For consistency's sake I had originally planned for the mega-rows to stack down indefinitely. But that would give human chr1 a 1:3 aspect ratio, which is not a good fit for a 16:9 screen. So I think we need one more level of structure instead. "Tiles" will be square sections of 10 mega-rows that stack horizontally, left to right. Chr1 would be about 2.5 tiles.
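To make the "2.5 tiles" figure concrete, here is a rough back-of-the-envelope sketch. It assumes the 10-megabase mega-rows from the sample above and the chr1 length quoted elsewhere in this thread; the constant and function names are just illustrative, not anything from the DDV code.

```python
import math

# Assumed layout constants, taken from the description in this thread.
MEGA_ROW_BP = 10_000_000        # one mega-row = 10 megabases
MEGA_ROWS_PER_TILE = 10         # one tile = 10 mega-rows
TILE_BP = MEGA_ROW_BP * MEGA_ROWS_PER_TILE

CHR1_BP = 247_249_719           # human chr1 length quoted in this thread


def layout_size(sequence_bp):
    """Return (mega-rows needed, fractional tiles occupied)."""
    mega_rows = math.ceil(sequence_bp / MEGA_ROW_BP)
    tiles = sequence_bp / TILE_BP
    return mega_rows, tiles


mega_rows, tiles = layout_size(CHR1_BP)
print(f"chr1: {mega_rows} mega-rows, {tiles:.2f} tiles")
```

Stacking all the mega-rows vertically is what produces the tall layout; splitting them into tiles of ten turns chr1 into two full tiles plus a partial third laid out side by side.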
I'm going to make the margins bigger.
This issue needs one last image: I decreased the layout by a factor of ten so I could see the largest scale tile layout without processing a 30GB+ file. Here's the result.
I think I toned down the margin growth rate after seeing this image too.
I was taking another look at DDV and specifically thinking about what it would look like to display the whole genome, without color compression, using DDV-style zoom. I noticed that the chromosomes get really long and a single column goes off the end of the screen. Ironically, this is the problem that Skittle, and then DDV, originally set out to solve by breaking the line and wrapping it around, keeping it on screen so you can see everything.
So I've come up with a layout that takes Skittle and DDV to their logical conclusion: it tiles the view recursively (fractally?) so that no matter the sequence size, the result is always vaguely squarish and will fit on the screen. It ensures that sequences that are close to each other will always be close in the visualization. It's basically introducing "page breaks" to DDV.
I've made the breaks multiples of 10 so it's also easy to keep track of how big things are. Each line is 100 bp. Each DDV column is 1,000 lines long, so 100,000 bp. Each set of ten columns makes a tile that is exactly one megabase, so those are easy to count. Then you have a super-column of 10 megabases, and for large chromosomes like chr1, you have pages of 100 megabases, which are easy to count.
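The hierarchy above can be sketched as a simple mixed-radix decomposition of a base-pair index. This is only an illustration of the counting scheme, not the actual DDV implementation; the level names and the `locate` function are made up for this example, and it says nothing about pixel coordinates or margins.

```python
# Each entry is (level name, how many of this unit fit in the next level up),
# following the sizes described above.
LEVELS = [
    ("base", 100),          # 100 bp per line
    ("line", 1000),         # 1,000 lines per column (100,000 bp)
    ("column", 10),         # 10 columns per tile (1 megabase)
    ("tile", 10),           # 10 tiles per super-column (10 megabases)
    ("super_column", 10),   # 10 super-columns per page (100 megabases)
]


def locate(position):
    """Split a 0-based base-pair index into its index at each layout level."""
    coords = {}
    remaining = position
    for name, count in LEVELS:
        remaining, coords[name] = divmod(remaining, count)
    coords["page"] = remaining  # whatever is left counts whole pages
    return coords


# Last base of a 247,249,719 bp chromosome (0-based index).
print(locate(247_249_718))
```

Because every level is a power of ten, the indices can be read almost directly off the decimal digits of the position, which is the point of choosing multiples of 10.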
Zoomed out:
This should be particularly nice when we start stacking 3 chromosomes together for comparison. Chromosome 1 is 247,249,719 bp, so three of them interleaved would be about 742 megabases, roughly three quarters of a gigabase, to visualize. With this approach we can visualize as much as we want.
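As a quick sanity check on the size of the interleaved view, here is a small sketch. It assumes the three sequences are each the length of chr1 and that interleaving simply adds their lengths, with the 100-megabase pages described above; the names are illustrative only.

```python
import math

CHR1_BP = 247_249_719       # chromosome 1 length quoted above
PAGE_BP = 100_000_000       # one page = 100 megabases

# Assumption: interleaving three chromosomes displays every base of each one.
total_bp = 3 * CHR1_BP
pages = math.ceil(total_bp / PAGE_BP)

print(f"{total_bp:,} bp across {pages} pages")
```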