ramnathv / rblocks

A fun and visual way to learn data structures and control flow in R.

Set colors based on data type #3

Open ramnathv opened 10 years ago

ramnathv commented 10 years ago

I have this working right now (need to push to repo). The idea is to use colors to encode data type.

library(rblocks)  # provides make_block()
x = list(a = 1:2, b = LETTERS[1:3], c = c(TRUE, FALSE, FALSE, TRUE, FALSE))
make_block(x)

[Image: rplot01]

This allows more interesting concepts to be explored, such as coercion of data types in a vector. In the vector below, TRUE gets coerced to numeric, so x is displayed entirely in light blue.

x = c(TRUE, 1)
make_block(x)

[Image: rplot02]
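
The coercion can also be checked without the plot. A minimal sketch using only base R (no rblocks functions are assumed here):

```r
# Combining a logical and a number forces a common type.
x <- c(TRUE, 1)

typeof(x)   # "double": TRUE has been coerced to 1
x           # [1] 1 1

# Coercion order for atomic vectors: logical < integer < double < character
typeof(c(TRUE, 1L))    # "integer"
typeof(c(TRUE, "a"))   # "character"
```

Because an atomic vector can only hold one type, the whole vector ends up with a single colour.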

Questions to Resolve

  1. What is a good color palette to use to encode this?
  2. How can we make it obvious which color is associated with which type?

polytechnantesINFO3 commented 10 years ago

Looking forward to using rblocks for visualizing the ensemble clustering process. Great work, thanks!

mbannert commented 10 years ago

Wow, great idea. W.r.t. your color scheme question, I'd like to encourage you to think of color-blind people, not to mention black and white :). Btw: I like the current color scheme.

ramnathv commented 10 years ago

@mbannert Thanks. This color palette was chosen from http://colorbrewer2.org/ and is colorblind safe.

polytechnantesINFO3 commented 10 years ago

Nice too, I am color blind, by the way.

ramnathv commented 10 years ago

Here is another cool application. Consider this list:

x = list(a = 1:3, b = LETTERS[1:3], c = c(TRUE, FALSE, FALSE))
make_block(x)

[Image: rplot03]

Now make x into a data.frame and view the block again. The second column is now displayed as an integer. Why on earth is that happening? Welcome to the evilness called factors. R is very fond of converting characters into factors wherever it can, and factors are stored as integers, which is why both the first and second columns are colored light blue.

y = as.data.frame(x)
make_block(y)

[Image: rplot04]
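
The storage type behind each column can be inspected directly in base R. Note that since R 4.0.0 the default is `stringsAsFactors = FALSE`, so in current versions the conversion to factors has to be requested explicitly. The sketch below does so and uses no rblocks functions:

```r
x <- list(a = 1:3, b = LETTERS[1:3], c = c(TRUE, FALSE, FALSE))

# Request the pre-4.0.0 behaviour explicitly.
y <- as.data.frame(x, stringsAsFactors = TRUE)

sapply(y, class)    # a: "integer", b: "factor",  c: "logical"
sapply(y, typeof)   # a: "integer", b: "integer", c: "logical"

# The factor keeps its labels separately from the integer codes.
levels(y$b)         # "A" "B" "C"
as.integer(y$b)     # 1 2 3
```

The class of column b is factor, but its underlying storage is integer, which is what a colour scheme keyed on storage type would pick up.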

kevinushey commented 10 years ago

Here are some of my thoughts on an appropriate colour scheme. It'd be nice to choose a colour scheme that can reflect similarity / disparity of types. Broadly, if we group things into quantitative and qualitative, we might have:

Quantitative: numeric, integer, complex, and maybe values of some of the date / time classes as well. These could occupy the 'blues'.

Qualitative: character, factor, logical. These could occupy the oranges.

And then types that are 'rare' or more internal:

Internal: symbol, language, environment, expression, etc. Maybe a brown / purple?

Based on your last example -- are you seeking more to represent internal storage types or the actual class / function that a type is used for?
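
One way to read the grouping above is as a lookup from type to colour family. The sketch below is only an illustration: `type_group`, `group_colour` and the hex colours are hypothetical and not part of rblocks. It also shows why the storage type versus class question matters, since a factor lands in a different family depending on which property is used.

```r
# Hypothetical lookup: property value -> colour family.
type_group <- c(
  double = "quantitative", integer = "quantitative",
  numeric = "quantitative", complex = "quantitative",
  character = "qualitative", factor = "qualitative", logical = "qualitative",
  symbol = "internal", language = "internal",
  environment = "internal", expression = "internal"
)

# Placeholder colours, one per family (a blue, an orange, a purple).
group_colour <- c(quantitative = "#6baed6",
                  qualitative  = "#fd8d3c",
                  internal     = "#9e9ac8")

f <- factor(c("a", "b"))

# By storage type the factor is an integer, so it lands in the blues.
by_type <- group_colour[[type_group[[typeof(f)]]]]

# By class it is a factor, so it lands in the oranges.
by_class <- group_colour[[type_group[[class(f)[1]]]]]
```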

ramnathv commented 10 years ago

You are right. After I posted this I refactored the code so that a custom function can specify a mapping from a property of an object to a colour. This will allow the use of mode, typeof, etc., and also provide flexibility with palettes. My only worry is that this will introduce fragmentation. Ideally, I would like all rblocks lessons to be consistent, so that they can be easily reused.
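
A minimal sketch of what such a mapping could look like, assuming a hypothetical helper `block_colours()` that takes a property function and a palette. This is not the actual rblocks API, and the palette values are placeholders. A shared default palette is one way to keep lessons consistent while still allowing overrides.

```r
# Hypothetical shared default palette, keyed by property value.
default_palette <- c(integer = "#a6cee3", double = "#a6cee3",
                     character = "#b2df8a", logical = "#fb9a99")

# Hypothetical helper: map each element of a list to a colour via `property`.
block_colours <- function(x, property = typeof, palette = default_palette,
                          fallback = "grey80") {
  # One key per element; take the first value in case property() returns several.
  keys <- vapply(x, function(el) property(el)[1], character(1))
  cols <- palette[keys]            # NA where the palette has no entry
  cols[is.na(cols)] <- fallback
  names(cols) <- names(x)
  cols
}

x <- list(a = 1:3, b = LETTERS[1:3], c = c(TRUE, FALSE, FALSE))
by_type <- block_colours(x)                    # colours by storage type
by_mode <- block_colours(x, property = mode)   # colours by mode
```

With this default palette, `mode(1:3)` is "numeric", which has no entry, so element a falls back to grey when colouring by mode. That is the kind of inconsistency between lessons that a single agreed palette would avoid.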