Closed pl-ki closed 6 years ago
So this was a very heavy birth, and we're not quite done yet...
First, remember that I'm red-green color-blind. So I'd like to accommodate the color-blind when fixing these categorical colors.
I tried tons of things; see https://github.com/JobLeonard/colorspace for an unorganised mess of the scripts used.
The colors coming out of iWantHue (http://tools.medialab.sciences-po.fr/iwanthue/) were not remotely good enough for me (even with colorblind mode), and the website also crashed all the time when generating huge color ranges. Amit also gave me a set of colors, but they weren't quite satisfactory.
So in the end I decided to manually determine the range of colors. After looking for useful color spaces, I ended up with HSLuv (http://www.hsluv.org/), which is a kind of hybrid between HSL and LUV.
Quoting the comment in generate_colors_hsluv.js (https://github.com/JobLeonard/colorspace/blob/master/generate_colors_hsluv.js#L376):
Eyeballing, distinct ranges seem to be:
0-360 hues (that is, full color wheel)
40-100 saturation
20-86 lightness (below and above this, colors fade too much)
However, eyeballing with my own protanomaly as a
"worst case baseline", assuming other color-vision
deficiencies are similar (but with other hue ranges),
the following rough rules of thumb seem to hold:
- changing hues with steps of 60 is a minimum
at all ranges, so six variations for hue
- lightness brings relatively high shifts in
perception for all saturations and hues,
steps of 6 seem ok, giving twelve options in total:
[20, 26, 32, 38, 44, 50, 56, 62, 68, 74, 80, 86]
- color distinction peaks with 62 and 68 lightness,
so these have a preference when dealing with
fewer categories
- for 44, 50, 56 and 62 lightness, saturation steps
40, 70 and 100 are available for distinction (3 options)
- 32 and 74 lightness, saturation should be either
60 or 100 to avoid overly similar colors (2 options)
- 20, 26, 80 and 86 lightness should only have 100 saturation
- 20 and 86 lightness have really low hue distinction, so let's use
only two far opposing ones.
- improving distinction between close colors is most
important. Since hues are cyclic, and we only care
about the distance between hues for a given L and S,
we can make a "checkerboard pattern" when varying
these values by alternating a +30 hue offset.
That means that identical hues are separated
by at least two steps of L+S, which hopefully
improves things a bit.
That leads to a max total of 114 HSLuv colors, many of which
will be very close of course.
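The "checkerboard" idea above can be sketched roughly as follows. This is only an illustration, not the code from generate_colors_hsluv.js: the function name `checkerboardGrid` is made up, it works on plain [hue, saturation, lightness] triples without converting them to RGB, and it assumes every lightness level uses the same saturation steps, so it does not reproduce the per-lightness rules or the total of 114.

```javascript
// Hypothetical sketch: build [hue, saturation, lightness] triples on a grid,
// shifting the hues by half a hue step (+30 for steps of 60) on every other
// (lightness, saturation) cell.
function checkerboardGrid(lightnessSteps, saturationSteps, hueStep = 60) {
  const colors = [];
  lightnessSteps.forEach((l, li) => {
    saturationSteps.forEach((s, si) => {
      // Alternate the offset like the squares of a checkerboard.
      const offset = (li + si) % 2 === 0 ? 0 : hueStep / 2;
      for (let h = 0; h < 360; h += hueStep) {
        colors.push([(h + offset) % 360, s, l]);
      }
    });
  });
  return colors;
}

// Example: three lightness levels, each with three saturation steps.
const grid = checkerboardGrid([44, 50, 56], [40, 70, 100]);
console.log(grid.length); // 3 * 3 * 6 = 54 triples
```

Cells that differ by a single lightness or saturation step get hues that are 30 apart, so the same hue only recurs two steps away.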
Note that perceived color distance is not weighed equally,
even with the step sizes defined above. For me, the following
holds true when I compare single-step changes in hue, saturation
or lightness, with the other two staying equal:
- Lightness has a bigger impact on perceptual changes,
and does not depend on colorvision
- Saturation second
- Hue the least (especially at lower saturation/lightness)
To generate an order that respects this, I came up with
Bresenham-inspired accumulators:
- Take each dimension (hue, saturation, lightness), make
an accumulator for each value in each dimension.
- Determine weights for each value
- Pick colors one by one. Each step:
- add weights to accumulators
- determine which unpicked color has highest sum of
hue + saturation + lightness accumulators
- pick this color
- set matching accumulators to zero
Doing this will ensure that values with high weights get picked more often, but not in quick succession. By tweaking both the weights and the initial accumulator values we can influence how the selected colors are distributed:
- Use the more saturated colors before the others
- Use lightness ranges 62 and 68 first, since color distinction is highest in these ranges
- Alternate strongly between light intensities when possible
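For readers who want to see the accumulator steps spelled out, here is a minimal sketch of how I read them. It is not the code from the colorspace repository: the name `orderByAccumulators`, the shape of the `weights` and `initial` arguments, the default weight of 1 and the tie-breaking (first candidate wins) are all assumptions made for illustration.

```javascript
// Sketch of the accumulator-based ordering (hypothetical names throughout).
// colors:  array of [h, s, l] triples
// weights: { h: {value: weight}, s: {...}, l: {...} }, missing entries count as 1
// initial: same shape, starting accumulator values, missing entries count as 0
function orderByAccumulators(colors, weights, initial = {}) {
  const dims = ['h', 's', 'l'];
  const lookup = (table, d, v, fallback) =>
    table[d] && table[d][v] !== undefined ? table[d][v] : fallback;

  // One accumulator per distinct value in each dimension.
  const acc = { h: new Map(), s: new Map(), l: new Map() };
  for (const color of colors) {
    dims.forEach((d, i) => acc[d].set(color[i], lookup(initial, d, color[i], 0)));
  }

  const remaining = colors.slice();
  const ordered = [];
  while (remaining.length > 0) {
    // Add each value's weight to its accumulator.
    for (const d of dims) {
      for (const [v, a] of acc[d]) acc[d].set(v, a + lookup(weights, d, v, 1));
    }
    // Find the unpicked color with the highest accumulator sum.
    let best = 0;
    let bestScore = -Infinity;
    remaining.forEach((c, idx) => {
      const score = acc.h.get(c[0]) + acc.s.get(c[1]) + acc.l.get(c[2]);
      if (score > bestScore) {
        bestScore = score;
        best = idx;
      }
    });
    // Pick it and reset the accumulators that match its values.
    const [picked] = remaining.splice(best, 1);
    dims.forEach((d, i) => acc[d].set(picked[i], 0));
    ordered.push(picked);
  }
  return ordered;
}

// Example: give lightness 62 twice the weight of lightness 20.
const candidates = [[0, 100, 20], [60, 100, 20], [0, 100, 62], [60, 100, 62]];
console.log(orderByAccumulators(candidates, { l: { 62: 2, 20: 1 } }));
```

Because the accumulators of a picked color drop back to zero, a value that was just used has to build up again before it wins, which is what keeps the same hue, saturation or lightness from being picked in quick succession.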
After quite a bit of tweaking with the weights this resulted in a list that was a good first start, but perceptually identical colors still ended up close together. Furthermore, ideally we'd put the more visually pleasant colors first, while still making use of different types of contrast (lightness, saturation and hue). That required manual intervention. So I had to come up with a few functions (https://github.com/JobLeonard/colorspace/blob/master/styledconsole.js) to make it easy to swap around and inspect colors in an array in the browser console until the results looked ok:
[image] https://user-images.githubusercontent.com/259840/34494928-0ddae65a-eff3-11e7-8a2b-441ebb3bcde6.png
To make things worse, I could not do this manual thing alone because I'm red-green colorblind. So you can all thank Irene Izquierdo, my girlfriend, who was willing to spend two afternoons of the holidays helping out. We basically kept swapping colors around until the first twenty looked pleasant to both of us, and all other ones had at least decent contrast with their immediate neighbours.
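I think the result is a decent enough compromise for now:
[screenshot_20180102_184347] https://user-images.githubusercontent.com/259840/34493619-0776f034-efed-11e7-8326-1a31d5d420ff.png
[screenshot_20180102_182514] https://user-images.githubusercontent.com/259840/34493620-07b50e96-efed-11e7-8779-7c8808e7d31b.png
[image] https://user-images.githubusercontent.com/259840/34496580-5252a280-effa-11e7-9864-08b6f8338ad2.png
It's not perfect, but this is already taking up way more hours than I want to put into it, and I have quite the to-do list.
While we were at it, we also updated the box-colors, because they apparently were a lot more purple than I noticed:
[screenshot_20180102_182337] https://user-images.githubusercontent.com/259840/34493622-0806734e-efed-11e7-90ee-d7c5582d8ca2.png
[screenshot_20180102_182406] https://user-images.githubusercontent.com/259840/34493621-07d981e0-efed-11e7-8734-420126b3bd39.png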
One thing that is left to do is exporting all the metadata (and eventually all rows), because we pre-calculate all the unique values, and up until now we just took the twenty most common ones. The new expansion code increases this to 1000 (with the viewer just cycling through them).
Another thing that is yet-to-be implemented is the scatterplot using different shapes for repeated colors.
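To make the cycling and the planned shapes a bit more concrete, here is one possible way to combine them. This is purely a sketch of the idea and not what the viewer does: `styleForCategory`, the palette entries and the shape names are all hypothetical.

```javascript
// Hypothetical helper: map a category index to a palette color and a marker
// shape. Colors cycle through the palette; every full pass through the
// palette moves on to the next shape, so repeated colors stay distinguishable.
function styleForCategory(index, palette, shapes) {
  const color = palette[index % palette.length];
  const shape = shapes[Math.floor(index / palette.length) % shapes.length];
  return { color, shape };
}

// Example with a tiny two-color palette and two shapes.
const examplePalette = ['#d62728', '#1f77b4'];
const exampleShapes = ['circle', 'square'];
console.log(styleForCategory(3, examplePalette, exampleShapes));
```

With a palette of N colors and M shapes this gives N * M distinct combinations before anything repeats exactly.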
Fine - I think the main point is that ALL clusters are displayed with some color, even if the first 20 are repeated - the likelihood that adjacent clusters get very similar colors is not that high at < 50 clusters, and we seldom have more (current peak is around 250).
Using different shapes as you mention will probably be beneficial.
Peter Lönnerberg, Linnarsson Lab, Laboratory of Molecular Neurobiology, Dept. of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77 Stockholm, Sweden
From: Job van der Zwan [notifications@github.com] Sent: 2 January 2018 20:21 To: linnarsson-lab/loom-viewer Cc: Peter Lönnerberg; Assign Subject: Re: [linnarsson-lab/loom-viewer] Color all cell clusters even when they are > 20 (#141)
Implemented, except for the shapes, which will come later.
The code for this is ready; the only thing holding it back is the color selection, but I finally figured out a system that gives me a selection I'm happy with.