arq5x / poretools

a toolkit for working with Oxford nanopore data
MIT License

Occupancy plot #26

Open nickloman opened 10 years ago

nickloman commented 10 years ago

To show which pores are active and when.

arq5x commented 10 years ago

Ha we were just discussing the same thing.

nickloman commented 10 years ago

I made a start here: https://github.com/arq5x/poretools/blob/master/poretools/occupancy.py

nickloman commented 10 years ago

woops!

arq5x commented 10 years ago

Nice. Would be interesting to visualize this using circle heatmaps as below (first answer methinks): http://stackoverflow.com/questions/13983225/plot-plate-layout-heatmap-in-r

Would love to organize the flow according to the flowcell layout such that bubbles or other structural effects are made apparent.

Ideas:

  1. Color of circle is on a gradient to depict fraction of run time that it was active?
  2. (optional) Size depicts log10 of the total yield of the pore?

nickloman commented 10 years ago

Ah OK, I was also thinking about a time-series style plot a la: image

The other would be a flow cell style heatmap.

It seems like the pore IDs are arranged slightly oddly (but regularly), judging from the MinKNOW screen.

No idea if there is any run-to-run variation in this?

nickloman commented 10 years ago

Should also spit out some stats, e.g. total channels seen, % occupancy (over time)
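A minimal sketch of how those stats could be computed, assuming each read is available as a hypothetical `(channel, start_seconds, duration_seconds)` tuple. The function name, the one-hour default bin, and the 512-channel default are assumptions for illustration, not existing poretools code. A channel counts as occupied in a time bin if at least one of its reads overlaps that bin.

```python
import math


def occupancy_stats(reads, run_length, bin_size=3600.0, n_channels=512):
    """Return (total channels seen, % of channels active in each time bin).

    reads      -- iterable of (channel, start_seconds, duration_seconds);
                  this record layout is an assumption for illustration
    run_length -- total run time in seconds
    bin_size   -- width of each time bin in seconds
    n_channels -- number of channels on the flow cell
    """
    n_bins = max(1, math.ceil(run_length / bin_size))
    active = [set() for _ in range(n_bins)]  # channels seen in each bin
    seen = set()                             # channels seen at any time

    for channel, start, duration in reads:
        seen.add(channel)
        first = int(start // bin_size)
        last = int((start + duration) // bin_size)
        # Mark every bin the read overlaps, clipped to the end of the run.
        for b in range(first, min(last, n_bins - 1) + 1):
            active[b].add(channel)

    percent = [100.0 * len(chs) / n_channels for chs in active]
    return len(seen), percent
```

Counting a channel as active when any read touches the bin is the simplest definition; weighting by the fraction of the bin actually covered by reads would be a stricter alternative.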

arq5x commented 10 years ago

This is a useful visualization as well. Perhaps we could offer two viz. options? I am happy to hack the heatmap style. They appear to use a reproducible 4 row wrapping algorithm for the layout. Upper right is the origin. The first block of 128 pores is found in the first 4 rows by 32 columns, the next 128 in the next 4 x 32, etc.

image
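The wrapping described above can be sketched as a function from channel number to grid position. The block size (128 channels in 4 rows by 32 columns) and the upper-right origin come from the description; the fill order within a block (down each group of 4 rows first, then one column to the left) is an assumption, and `channel_to_grid` is a hypothetical name.

```python
ROWS, COLS = 16, 32      # full 512-channel grid
BLOCK_SIZE = 128         # channels per block
ROWS_PER_BLOCK = 4       # each block spans 4 rows by 32 columns


def channel_to_grid(channel):
    """Map a 1-based channel number to (row, col), both 0-based.

    Row 0 is the top row and col 0 is the left-most column, so channel 1
    lands in the upper-right corner at (0, 31).
    """
    if not 1 <= channel <= ROWS * COLS:
        raise ValueError("channel must be between 1 and 512")
    block, idx = divmod(channel - 1, BLOCK_SIZE)
    # Assumed fill order: 4 channels down a column, then step left.
    col_from_right, row_in_block = divmod(idx, ROWS_PER_BLOCK)
    row = block * ROWS_PER_BLOCK + row_in_block
    col = COLS - 1 - col_from_right
    return row, col
```

For example, `channel_to_grid(1)` gives `(0, 31)` and `channel_to_grid(129)` gives `(4, 31)`, the start of the second block.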

nickloman commented 10 years ago

If you wanna take this on I'd be more than happy!

arq5x commented 10 years ago

Will do. By the time you post the first method, I should be able to free up some fun time to work on this.

arq5x commented 10 years ago

First pass of the ggplot code. Need to solve the layout issue:

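library(ggplot2)
set.seed(123)
# 16 rows (a-p) x 32 columns = 512 channels; colorvar is random placeholder data
platelay <- data.frame(rown = rep(letters[1:16], 32), coln = rep(1:32, each = 16),
                       colorvar = rnorm(512, 0.3, 0.2))

# Set the row levels explicitly so row "a" is drawn at the top; this works
# whether or not data.frame() converts strings to factors.
ggplot(platelay, aes(y = factor(rown, levels = rev(letters[1:16])), x = factor(coln))) +
  geom_point(aes(colour = colorvar), size = 12) + theme_bw() +
  geom_text(aes(label = 1:512), colour = "white") +
  labs(x = NULL, y = NULL)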

rplot

arq5x commented 10 years ago

Boom. Layout unlocked.

rplot01

Code:

# Labels generated with the following Python code:
# seeds = [125,121,117,113,109,105,101,97,93,89,85,81,77,73,69,65,61,57,53,49,45,41,37,33,29,25,21,17,13,9,5,1]
# for s in seeds:
#     for block in range(4):
#         for row in range(4):
#             print(s + 128 * block + row, end=", ")
labels = c(125, 126, 127, 128, 253, 254, 255, 256, 381, 382, 383, 384, 509, 510, 511, 512, 
           121, 122, 123, 124, 249, 250, 251, 252, 377, 378, 379, 380, 505, 506, 507, 508, 
           117, 118, 119, 120, 245, 246, 247, 248, 373, 374, 375, 376, 501, 502, 503, 504, 
           113, 114, 115, 116, 241, 242, 243, 244, 369, 370, 371, 372, 497, 498, 499, 500, 
           109, 110, 111, 112, 237, 238, 239, 240, 365, 366, 367, 368, 493, 494, 495, 496, 
           105, 106, 107, 108, 233, 234, 235, 236, 361, 362, 363, 364, 489, 490, 491, 492, 
           101, 102, 103, 104, 229, 230, 231, 232, 357, 358, 359, 360, 485, 486, 487, 488, 
           97, 98, 99, 100, 225, 226, 227, 228, 353, 354, 355, 356, 481, 482, 483, 484, 
           93, 94, 95, 96, 221, 222, 223, 224, 349, 350, 351, 352, 477, 478, 479, 480, 
           89, 90, 91, 92, 217, 218, 219, 220, 345, 346, 347, 348, 473, 474, 475, 476, 
           85, 86, 87, 88, 213, 214, 215, 216, 341, 342, 343, 344, 469, 470, 471, 472, 
           81, 82, 83, 84, 209, 210, 211, 212, 337, 338, 339, 340, 465, 466, 467, 468, 
           77, 78, 79, 80, 205, 206, 207, 208, 333, 334, 335, 336, 461, 462, 463, 464, 
           73, 74, 75, 76, 201, 202, 203, 204, 329, 330, 331, 332, 457, 458, 459, 460, 
           69, 70, 71, 72, 197, 198, 199, 200, 325, 326, 327, 328, 453, 454, 455, 456, 
           65, 66, 67, 68, 193, 194, 195, 196, 321, 322, 323, 324, 449, 450, 451, 452, 
           61, 62, 63, 64, 189, 190, 191, 192, 317, 318, 319, 320, 445, 446, 447, 448, 
           57, 58, 59, 60, 185, 186, 187, 188, 313, 314, 315, 316, 441, 442, 443, 444, 
           53, 54, 55, 56, 181, 182, 183, 184, 309, 310, 311, 312, 437, 438, 439, 440, 
           49, 50, 51, 52, 177, 178, 179, 180, 305, 306, 307, 308, 433, 434, 435, 436, 
           45, 46, 47, 48, 173, 174, 175, 176, 301, 302, 303, 304, 429, 430, 431, 432, 
           41, 42, 43, 44, 169, 170, 171, 172, 297, 298, 299, 300, 425, 426, 427, 428, 
           37, 38, 39, 40, 165, 166, 167, 168, 293, 294, 295, 296, 421, 422, 423, 424, 
           33, 34, 35, 36, 161, 162, 163, 164, 289, 290, 291, 292, 417, 418, 419, 420, 
           29, 30, 31, 32, 157, 158, 159, 160, 285, 286, 287, 288, 413, 414, 415, 416, 
           25, 26, 27, 28, 153, 154, 155, 156, 281, 282, 283, 284, 409, 410, 411, 412, 
           21, 22, 23, 24, 149, 150, 151, 152, 277, 278, 279, 280, 405, 406, 407, 408, 
           17, 18, 19, 20, 145, 146, 147, 148, 273, 274, 275, 276, 401, 402, 403, 404, 
           13, 14, 15, 16, 141, 142, 143, 144, 269, 270, 271, 272, 397, 398, 399, 400, 
           9, 10, 11, 12, 137, 138, 139, 140, 265, 266, 267, 268, 393, 394, 395, 396, 
           5, 6, 7, 8, 133, 134, 135, 136, 261, 262, 263, 264, 389, 390, 391, 392, 
           1, 2, 3, 4, 129, 130, 131, 132, 257, 258, 259, 260, 385, 386, 387, 388)

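library(ggplot2)
set.seed(123)
# 16 rows (a-p) x 32 columns = 512 channels; colorvar is random placeholder data
platelay <- data.frame(rown = rep(letters[1:16], 32), coln = rep(1:32, each = 16),
                       colorvar = rnorm(512, 0.3, 0.2))

# Set the row levels explicitly so row "a" is drawn at the top; this works
# whether or not data.frame() converts strings to factors.
ggplot(platelay, aes(y = factor(rown, levels = rev(letters[1:16])), x = factor(coln))) +
  geom_point(aes(colour = colorvar), size = 12) + theme_bw() +
  geom_text(aes(label = labels), colour = "white") +
  labs(x = NULL, y = NULL)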

nickloman commented 10 years ago

schweeet

arq5x commented 10 years ago

Would be neat to use color to depict occupied versus unoccupied and make a time lapse movie of these plots over time

timp0 commented 10 years ago

Also - a suggestion would be to add channel to the poretools times function - though easy to regexp, it might make it easier for people to generate their own occupancy plots.
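For anyone doing the regexp by hand in the meantime, a minimal sketch follows. It assumes the channel number appears in the file name as `_ch<number>_`; that naming pattern, the example file name in the comment, and the function name are assumptions for illustration, not part of poretools.

```python
import os
import re

# Assumed naming pattern: "..._ch<number>_..." somewhere in the file name.
CHANNEL_RE = re.compile(r"_ch(\d+)_")


def channel_from_filename(path):
    """Return the channel number parsed from a file name, or None.

    Example (hypothetical file name):
        channel_from_filename("run1_ch123_file7_strand.fast5")
    """
    match = CHANNEL_RE.search(os.path.basename(path))
    return int(match.group(1)) if match else None
```

Only the base name is searched, so a directory that happens to contain `_ch12_` does not produce a false match.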

arq5x commented 10 years ago

Great suggestion. Just added it to the repo.

timp0 commented 10 years ago

This pore layout plot (I have my own version, made by stealing your R code) is useful for more than occupancy. I suggest also allowing pore yield to be plotted this way.

timp0 commented 10 years ago

Also, you may want a plot of (length of time empty) as a function of run time. In the plot below, orange is the empty duration and blue is the occupied duration. Can you guess when we added more library?

image
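One way to get the empty durations for a plot like this is to look at the gaps between consecutive reads on the same channel. The sketch below assumes reads are available as hypothetical `(channel, start_seconds, duration_seconds)` tuples and treats any time between reads on a channel as "empty"; whether the plot above was computed this way is not stated.

```python
from collections import defaultdict


def empty_intervals(reads):
    """Return a sorted list of (gap_start, gap_length) across all channels.

    reads -- iterable of (channel, start_seconds, duration_seconds);
             this record layout is an assumption for illustration.
    A gap is the time between the end of one read and the start of the
    next read on the same channel.
    """
    by_channel = defaultdict(list)
    for channel, start, duration in reads:
        by_channel[channel].append((start, start + duration))

    gaps = []
    for intervals in by_channel.values():
        intervals.sort()
        prev_end = intervals[0][1]
        for start, end in intervals[1:]:
            if start > prev_end:
                gaps.append((prev_end, start - prev_end))
            prev_end = max(prev_end, end)
    return sorted(gaps)
```

Plotting `gap_length` against `gap_start` gives empty duration as a function of run time; the read durations themselves give the occupied series.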