Add interlaced palettes

dpordomingo commented 4 years ago

fix #303

This PR converts srcdMain and srcdAll into interlaced palettes.

You can see the new srcdMain (interlaced) in the first row, and its previous version (gradient-like) in the second row.
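
For readers unfamiliar with the term, here is a minimal sketch of what "interlacing" a gradient-like palette could mean. It assumes the palette is grouped by hue and takes one shade from each hue in turn; the color names are hypothetical placeholders, not the actual srcdMain values, and the real palettes in this PR may be ordered differently.

```python
from itertools import zip_longest


def interlace(hue_groups):
    """Round-robin across hue groups: first shade of each hue, then second, etc."""
    rounds = zip_longest(*hue_groups)  # pads shorter groups with None
    return [color for shades in rounds for color in shades if color is not None]


# Hypothetical gradient-like palette: shades of one hue sit next to each other.
gradient = [
    ["purple1", "purple2", "purple3"],
    ["green1", "green2", "green3"],
    ["orange1", "orange2", "orange3"],
]

interlaced = interlace(gradient)
print(interlaced)
```

With this ordering, no two consecutive palette entries share a hue as long as there are at least two hue groups of equal length.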

[x] I have updated the CHANGELOG file according to the conventions in keepachangelog.com
[ ] This PR contains changes that do not require a mention in the CHANGELOG file

dpordomingo commented 4 years ago

~Pending to clarify:~

~should we keep the old palettes as srcdMainGradient and srcdAllGradient? or would it be better if we remove them, and then use only the new interlaced ones?~

dpordomingo commented 4 years ago

Problems I see:

The requirement is to avoid showing similar colors one next to the other.

This cannot always be guaranteed, as the screenshots below show :point_down: It happens because Apache Superset assigns the same color to an element in a chart if that same element already appeared in another chart using the same palette. That's why, in the third chart, JSON, Python, and Text all use orange: it is the color each was assigned in the first chart where it appeared.

Why is it happening?

Because Apache Superset is designed that way on purpose (that's what I found googling), and it also happens with their own palettes. The very same color can even appear twice, side by side, as can be seen in the screenshot I posted when opening the pull request for the source{d} palettes. As you can see, both Juanjo and Alfredo have the very same pink color.

rpau commented 4 years ago

@dpordomingo The Apache Superset criteria for colors make sense to me. Does it make sense to consider that we are not providing the charts in the appropriate dashboard? I am assuming that if we use a different dashboard for that chart, the problem disappears. Am I right?

rpau commented 4 years ago

Regarding keeping the old palettes, I would say not. We'll check git log for that :P

dpordomingo commented 4 years ago

[...] Does it make sense to consider that we are not providing the charts in the appropriate dashboard? [...] if we use a different dashboard for that chart, the problem disappears[...]

I don't see your point @rpau

If I understood correctly, "the problem" is that "similar colors can appear together". It can happen in one chart if you have, in the same dashboard, another chart using the same palette and the same items in a different order, e.g.

chart one: language by lines (go:purple1, python:green1, java:purple2)
chart two: language by bytes (go:purple1, java:purple2, python:green1)
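
A minimal sketch of that behavior, assuming a simplified model in which a dashboard-wide scale remembers the first color given to each label. `SharedColorScale` is a hypothetical name for illustration and is not Superset's actual implementation.

```python
class SharedColorScale:
    """Simplified model: a label keeps the first color it was ever given."""

    def __init__(self, palette):
        self.palette = palette
        self.assigned = {}  # label -> color, shared by every chart on the dashboard

    def color_for(self, label):
        if label not in self.assigned:
            # Next unused palette slot, wrapping around when the palette runs out.
            index = len(self.assigned) % len(self.palette)
            self.assigned[label] = self.palette[index]
        return self.assigned[label]


# Hypothetical interlaced palette: neighbors in the palette never share a hue.
scale = SharedColorScale(["purple1", "green1", "purple2", "green2"])

chart_one = [(lang, scale.color_for(lang)) for lang in ["go", "python", "java"]]
chart_two = [(lang, scale.color_for(lang)) for lang in ["go", "java", "python"]]

print(chart_one)
print(chart_two)
```

In the second chart, go and java end up adjacent with purple1 and purple2, even though the palette itself never places two purples together.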

Regarding keeping the old palettes, I would say not. We'll check git log for that :P

It was not about storing them for historical reasons, but about keeping them for using them in other charts, where we could want to have this gradient effect. Can I assume @ricardobaeta @rpau that we're also deprecating the old srcdMain and srcdAll, so I can safely delete both?

rpau commented 4 years ago

Thanks @dpordomingo .

It can happen in one chart if you have, in the same dashboard, another chart using the same palette and the same items in a different order

Sorry, my message was not clear enough. Exactly, this is what I understood. My point is that I think Superset is right from a user's point of view. Having different legends/colors for the same "element" types produces confusion.

So, IMHO, from a consistency point of view this is the expected behavior. If it causes confusion, it is probably because we are merging analytics/charts that might require a different dashboard/view.

It was not about storing them for historical reasons, but about keeping them for using them in other charts, where we could want to have this gradient effect. Can I assume @ricardobaeta @rpau that we're also deprecating the old srcdMain and srcdAll, so I can safely delete both?

Yes. If they are not currently used, I would prefer to remove them.

dpordomingo commented 4 years ago

Yes. If they are not currently used, I would prefer to remove them.

Previous palettes (gradient versions) were just deleted.

marnovo commented 4 years ago

@ricardobaeta is doing further internal user testing before we merge it (see Slack messages from this one).

Ricardo Baeta Current: Makes it hard to distinguish Charts Sections. We're using sequential gradients.

Ricardo Baeta Solution: We have a clear distinction between data sections, and we still reinforce data representation with our full brand colour spectrum, using interlaced gradients. (edited)

Feedback so far:

Marcelo Novaes Any alternatives? The tones of a given color still look very similar to me, even on good displays. So it's true you don't confuse one with the next, but you still confuse them among themselves.

Ricardo Baeta Thank you for the feedback :+1: It's true we could make the gradients more distinguishable, but we would have very light colours on the end of the gradient spectrum.

Vadim Markovtsev I am a data scientist and I cannot draw any conclusions from the color bars. Too much information. The "current" bars are much better to me.

Lou Marvin Caraig my only feedback is that the "new" palette seems a bit sad, I mean the majority of colors are not as bright as the current ones (edited)

Lou Marvin Caraig I don’t know why I have this perception :sweat_smile:

Replying to the Slack messages above:

@se7entyse7en, this is likely due to mismatched color profiles (not) being embedded in the screenshot images and/or (not) being rendered by the app in which they are displayed. If you compare David's screenshot on this PR in a browser vs the image on Slack you'll notice they're different.

@vmarkovtsev, let me try to “translate” what you meant and add a few bits:

With the "Current" sequential shades of a given hue, it's more linear for the brain to move and "stop" where each hue starts/ends, and easier to know where you are and to jump back/forth because we "group" the colors of the same hue. The main downside is that, if the tones/shades* of a given hue are similar & together, they become less distinguishable (current problem), plus it may give some sense that tones of a hue have something in common (which may or may not be true).
With the proposed “Solution” of interlaced gradients alternating hues and shades, it's harder for the brain to set these yardsticks on where you are at some point, thus giving this feeling of noise or getting lost ("was I in the 2nd or 3rd orange?"). The advantages are that in case of sequential data series it's easier to differentiate between them, and that you don't convey a sense of "grouping" on data that's not really a group.

* Some help on color theory

dpordomingo commented 4 years ago

I'd say that we should consider the problem in an analytical way.

I'd ask:

what's the purpose of the colors in these charts?

a) differentiate different items in the chart,
b) identify the same item because it's repeated in different places of the same chart,
c) identify how one value changes along with another metric (e.g. commits per day),
d) branding.
(others that we could identify)

I think this is an important question because its answer would force us to choose a certain palette or, even more, a certain chart.

I also wonder if we could develop a palette that would satisfy both a and b.

a) differentiate contiguous items

From the issue description, I'd say this is the case because people complained about being "[...] hard to see the different sections"

The pie chart from the example would be this case: since the same item is not repeated, the user only needs to differentiate contiguous items. The edge case would be a two-color palette if one item can be surrounded by only two items, or a four-color palette if items are distributed in a map. If using many colors, including many shades of the same tint (as in our case), the sorting is crucial because two consecutive colors need to contrast clearly.

If we consider the Apache Superset color assignment rules, which do not rely on palette order (see https://github.com/src-d/sourced-ui/pull/305#issuecomment-543327784), the palette should have:

as many contrasting colors as possible,
as few shades of the same tint as possible.

An alternative could be choosing charts where the gap between items is clear, no matter the color they have: an edge case could be two consecutive items colored in the very same purple color, that can still be differentiated because both items are separated by a white gutter or a black border.

b) identify the same item, which is repeated in many places of the same chart

We need to use a palette which MUST contain (at least) as many colors as there are different repeated elements; otherwise, at least one color will be reused for more than one element, and we won't achieve this goal. Imo using more than 2 different shades of the same color can hinder this goal because, when those shades are far enough apart in the chart, it becomes difficult to tell the different purples apart.

The bar chart of the example would be, imo, an example of this situation, because the same repo will be in many columns. In that example each repo should have its very own color, to be clearly identified in each column. Since the example above contains more repos than colors, the goal of "identifying repeated items" cannot be satisfied.
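
The counting argument above can be sketched as follows. This is only an illustration with hypothetical repo and color names; it assumes colors are handed out in palette order and reused cyclically once the palette is exhausted.

```python
def assign_colors(items, palette):
    """Give each distinct item a color, cycling through the palette when it runs out."""
    distinct = list(dict.fromkeys(items))  # keep first-seen order, drop repeats
    return {item: palette[i % len(palette)] for i, item in enumerate(distinct)}


def shared_colors(assignment):
    """Return the colors used by more than one item (empty when the goal is met)."""
    by_color = {}
    for item, color in assignment.items():
        by_color.setdefault(color, []).append(item)
    return {color: items for color, items in by_color.items() if len(items) > 1}


# Hypothetical data: four repos but only three colors.
repos = ["repo-a", "repo-b", "repo-c", "repo-d"]
palette = ["purple", "green", "orange"]

clashes = shared_colors(assign_colors(repos, palette))
print(clashes)
```

Any non-empty result means at least two items cannot be told apart by color alone.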

I'd propose something like what is explained in https://github.com/src-d/sourced-ui/pull/305#issuecomment-545271704:

c) identify how one value changes along with another metric (e.g. commits per day),

In general, it would be better if we use gradients with only one tint (to represent a single range), or with only two tints (e.g. to represent two sides from a midpoint).
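
As a rough sketch of those two gradient shapes, the snippet below builds a single-tint (sequential) ramp and a two-tint (diverging) ramp by linear interpolation in RGB. The function names and hex values are assumptions for illustration only; they are not source{d} brand colors, and plain RGB interpolation is just the simplest option, not necessarily the perceptually best one.

```python
def hex_to_rgb(hex_color):
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def lerp_color(start, end, t):
    """Linearly interpolate between two hex colors, with t in [0, 1]."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))


def sequential(start, end, steps):
    """Single range: one tint fading from start to end (steps must be >= 2)."""
    return [lerp_color(start, end, i / (steps - 1)) for i in range(steps)]


def diverging(low, mid, high, steps_per_side):
    """Two sides from a midpoint: low -> mid -> high, with mid appearing once."""
    left = sequential(low, mid, steps_per_side + 1)
    right = sequential(mid, high, steps_per_side + 1)
    return left + right[1:]


single_tint = sequential("#ffffff", "#5b2a86", 5)
two_tints = diverging("#d95f02", "#ffffff", "#5b2a86", 2)
print(single_tint)
print(two_tints)
```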

Our current srcdMain and srcdAll palettes use gradients of three and six different tints, which is great to represent altitudes in a topographic map but might not be our case.

d) branding,

If that's the case, we should only use aesthetic criteria.

marnovo commented 4 years ago

I feel we might be getting into a series of local maxima (a.k.a. playing Whac-A-Mole with bugs) with this.

Story so far: We implemented our custom default color scheme for charts on UI to improve UX and strengthen the brand—same as the bigger UI branding project. User testing was not thorough and later a number of people internally promptly noticed some usability issues at different degrees since we rolled it out and demo-ed. We listed a few of them, solved one earlier and now trying to solve another, and a few others are arising from there…

Let's for a moment take a step back and look at the bigger picture and how we got here, think of the problem as a whole, then assess if the marginal problem is worth solving, when, and especially how, so we're not repeating the same mistakes in the process. So keep asking why & how, and aiming at the output & outcome…

Some context to look at the why & outcome, and think of the how & output:

1. Is it important? Yes, the default color scheme display is an important feature of the product: it's a data consumption tool and data visualization is a major aspect of it, the default is used on built-in charts, and few people ever change from defaults.
2. Is it impacting? It seems so: the current issues with the color scheme are noticeable clearly/frequently enough that they decrease the quality of the user experience. But we don't know how much and in what aspects well enough; we must have more confidence here to justify further intensive work.
3. Is it pressing? Yes, the issues are in the live product at CE/EE, and especially with the new projects kickstarting in 1-2 weeks, we want to fix it sooner rather than later.
4. Is it complex? Yes, it is a problem that has no perfect answer, given the number of stronger/weaker constraints and a certain level of subjectivity (see full list).
5. Is it of our own? Yes, we introduced this new color scheme because we wanted to improve the UX via UI, as we did with other aspects of the original Superset.

Given it's important (1) & likely impacting (2) we should deal with this at least to assess B and, given it's pressing (3), do it now and efficiently. Given it's complex (4), it's unlikely we can solve it by sheer thinking & skill in a tiny group of people. Testing early is key. Lastly, given it's of our own making (5), we should look at whether we made it better or worse than it was.

This can be tackled in an iterative approach until the why is not justified by the outcome given the cost of how & output. Example:

Why: one of the key features of the product seems to have significant and impacting enough UX issues with enough users
Expected outcome: UX for users of existing/upcoming product deployments is now significantly better and/or at least good enough according to our criteria (& maybe colors branded)
How: define aspects to evaluate the extent & impact of the problems, user-test a subset of color schemes on different charts based on real data, aggregate evaluations based on criteria, choose among the best-evaluated
Expected output: color scheme defined by evaluation criteria only then PR-ed to sourced-ui as the default and communicated

This above can be applied at different steps w/ different costs/scales.

A good and very simple first step that can be done in just a couple of days from where we are right now, to validate the why, make the how/output cheap, and ensure an outcome:

Set a list of criteria* based on the problems reported, UI/UX knowledge on dataviz, common sense; and a target minimum "rating"
Test a subset of the available color schemes in a very quick & cheap, semi-realistic setting: 15-20 internal users; the source{d} proposed + current and Superset default + some others we think are good candidate color schemes; 4 main different chart types common on our dashboards;
1. Apply the palettes to a few different common charts types on some setting that's easy and quick, like tools available for this, superset UI (if just editing the chart and saving) or even Google Sheets with data exported from the Superset charts;
2. Send link to users on Slack; Ask users to rate each scheme based on the criteria available; ask users to rank criteria; can be straight in the google sheets, or a simple form + screenshots;
Adjust the criteria rank based on user feedback, calculate the ratings for each color scheme, see which make the cut / are the best rated;
1. If one or more make the cut, the best-rated one becomes the (new) default, only then code is written, PR is made and merged for new release to be tagged, communicate release
2. If none make the minimum cut, see the most efficient change that can be made to make one of the best ranked schemes to make it. Test it again with some of the most critical people on this one.
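
The rating step above could be computed along these lines. This is a sketch only: the scheme names, criteria, weights, ratings, and minimum are invented placeholders to show the arithmetic, not results from any actual user test.

```python
from statistics import mean


def weighted_scores(ratings, weights):
    """Average each criterion over users, then combine by criterion weight."""
    total_weight = sum(weights.values())
    return {
        scheme: sum(mean(per_criterion[c]) * w for c, w in weights.items()) / total_weight
        for scheme, per_criterion in ratings.items()
    }


def make_the_cut(scores, minimum):
    """Schemes at or above the minimum rating, best-rated first."""
    passing = [(scheme, score) for scheme, score in scores.items() if score >= minimum]
    return sorted(passing, key=lambda pair: pair[1], reverse=True)


# Placeholder inputs: weights come from the users' ranking of the criteria,
# ratings are one 1-5 score per user for each scheme and criterion.
weights = {"distinguishable": 2, "pleasant": 1}
ratings = {
    "scheme_a": {"distinguishable": [4, 5, 3], "pleasant": [2, 3, 4]},
    "scheme_b": {"distinguishable": [2, 2, 2], "pleasant": [4, 4, 4]},
}

scores = weighted_scores(ratings, weights)
winners = make_the_cut(scores, minimum=3.0)
print(winners)
```

If `winners` is empty, none of the schemes made the minimum cut and the second branch of the process applies.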

In the best case scenario, it might be that the issues we seem to have are minor enough we don't care enough to fix them right now, or that we find a good one.

In the worst case scenario, all palettes have problems, but at least we know which ones, to what extent, and can work iteratively to fix them and repeat the process.

*A list of goals and constraints for the problem at hand:

Colors should be visible on the white background
Colors should make overlaid text labels visible
Colors should be varied enough that in most real-life charts each data series is uniquely colored
Colors should be distinguishable when presented sequentially
1. The user cannot hand pick the color for a given data series in a palette
2. Superset has this data series <-> color consistency feature (bug) we probably don't want to change now
3. Superset had a color order randomization issue
Colors should work for most chart types that we use on dashboards
1. Colors should not break existing charts & dashboards
Colors should be distinguishable by people with color vision deficiencies
Colors should be distinguishable on mid-range displays
Colors should help eye navigation and avoid noisiness/confusion
Colors should avoid conveying undue semantics, like groups that aren't groups, or "good"/"bad"
Colors should not be tiresome to look at for long periods
Colors should convey more neutral or positive than negative emotions
Colors should be simple to use—the simplest is not to have to think of them by default
Colors should relate to source{d} branding, so users know they are looking at a chart/product coming from source{d}
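
Some of these criteria can be checked mechanically. As one example, the first item (visibility on a white background) could be approximated with the WCAG contrast ratio. The sketch below uses the standard WCAG relative-luminance formula; the 3:1 threshold is an assumption borrowed from the WCAG guidance for non-text graphics, and the sample colors are arbitrary, not palette values from this PR.

```python
def relative_luminance(hex_color):
    """WCAG relative luminance of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical) up to 21 (black vs white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def visible_on_white(hex_color, minimum=3.0):
    # minimum=3.0 is an assumed threshold, adjust to whatever the team agrees on
    return contrast_ratio(hex_color, "#ffffff") >= minimum


for sample in ["#000000", "#ffff00", "#ffffff"]:
    print(sample, round(contrast_ratio(sample, "#ffffff"), 2), visible_on_white(sample))
```

Very light shades at the end of a gradient are the ones most likely to fail such a check.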

dpordomingo commented 4 years ago

An example of how important it is to choose a proper palette and proper data when trying to show some info.

dpordomingo commented 4 years ago

I labeled this PR as blocked per https://github.com/src-d/sourced-ui/pull/305#pullrequestreview-305537592. Imo, it could be useful, @marnovo, if you submitted your review as "Request changes".
