[public dashboard] Unify color scheme for labels, including dynamic labels

e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.

https://e-mission.readthedocs.io/en/latest

BSD 3-Clause "New" or "Revised" License

[public dashboard] Unify color scheme for labels, including dynamic labels #1047

Closed Abby-Wheelis closed 3 weeks ago

Abby-Wheelis commented 4 months ago

Problem:

After recent changes allowing dynamic labels to be displayed on the public dashboard, there is an issue where "the same color means different things" chart-to-chart (e.g., Car Drove Alone below). This prevents users from easily understanding the dashboard and makes it nearly impossible to visually compare the pie charts.

[Screenshot 2024-02-01 at 5:15:38 PM]

This happens because the "color mapping" is generated each time a chart is made, with the line colours = dict(zip(labels, plt.cm.tab20.colors[:len(labels)])). It is based solely on the list of labels passed in to the function on that specific occasion, which is influenced by the different subsets or ordering changes that can happen in datasets chart-to-chart.
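The effect of that line can be reproduced in a few lines. This is a minimal sketch, not the dashboard code: `PALETTE` is a stand-in list of hex strings used in place of `plt.cm.tab20.colors` so the example runs without matplotlib, and `per_chart_colors` is a hypothetical name.

```python
# Stand-in palette (assumption): hex strings instead of plt.cm.tab20.colors.
PALETTE = ["#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c"]


def per_chart_colors(labels):
    # Same pattern as the line quoted above: the color a label gets depends
    # only on its position in the list passed to this particular call.
    return dict(zip(labels, PALETTE[:len(labels)]))


# Two charts whose label lists differ because of filtering or ordering.
chart_a = per_chart_colors(["Car Drove Alone", "Walk", "Bike"])
chart_b = per_chart_colors(["Walk", "Car Drove Alone"])

print(chart_a["Car Drove Alone"])  # first palette entry
print(chart_b["Car Drove Alone"])  # second palette entry
```

The same label lands on two different colors simply because it sits at a different index in each list.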

Solution:

To fix this, we need some way of unifying "what color means what mode". I am proposing that we do this at a notebook-specific level, so the mapping is the same for the whole notebook, and then we just pass the color map into the chart-making function instead of making it within the function.

For example, if the set of dynamic labels dictates that the modes are "walk, bike, car", then we would make a map that says "walk: blue, bike: green, car: red". If there are no dynamic labels, we can use the same method with the default labels. The process for purposes would be similar. Then the chart-making functions take the color maps as a parameter, and therefore use the same map every time.
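A rough sketch of that proposal, using the "walk, bike, car" example. The function names `build_color_map` and `pie_chart_colors` are hypothetical and only illustrate the shape of the change; the real chart-making functions take more parameters.

```python
def build_color_map(all_labels, palette):
    """Map every possible label to a fixed color, once per notebook."""
    if len(all_labels) > len(palette):
        raise ValueError("palette has fewer colors than labels")
    return dict(zip(all_labels, palette))


def pie_chart_colors(chart_labels, color_map):
    """Hypothetical stand-in for a chart-making function: colors are looked
    up in the shared map instead of being generated per chart."""
    return [color_map[label] for label in chart_labels]


# Top of the notebook: one map for modes (purposes would get their own map).
mode_colors = build_color_map(["walk", "bike", "car"], ["blue", "green", "red"])

# Two charts with different subsets and orderings of the same labels.
colors_chart_1 = pie_chart_colors(["car", "walk"], mode_colors)
colors_chart_2 = pie_chart_colors(["bike", "car"], mode_colors)
print(colors_chart_1, colors_chart_2)
```

Because both charts read from `mode_colors`, "car" is red in each of them regardless of which other labels are present.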

We may need to consider what "kind" of labels the chart-making functions have (are they raw or cleaned) and map colors to either raw or translated/cleaned labels accordingly.

[@iantei, @shankari for visibility]

Abby-Wheelis commented 4 months ago

In the long run, we want to have this color scheme be unified with the colors used on the label screen and dashboard in the phone. And not only should the colors be unified, but the emissions factors and energy use as well.

On the phone (if I'm remembering correctly) we map the color according to a base mode, and then for each subsequent mode of that base mode we lighten/darken the color by a certain degree. For example, car is red, and then shared car ends up being dark red, because they share the base mode of CAR.

It was discussed today that we should consider having a library that handles this functionality, and I think the way we handle colors in the phone would be a good candidate to be moved to that library.

If/when we handle colors in a library, we could have a function that takes in a set of custom modes and returns a map of those modes to colors (according to the base-mode-modulation method) which could then serve as our color map for the chart.
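As food for thought, such a function might look roughly like the sketch below. Everything here is an assumption made for illustration: the base colors, the `step` size, and the names `BASE_COLORS`, `shade`, and `colors_for_modes` are hypothetical, and this is not the phone's actual implementation.

```python
import colorsys

# Hypothetical base-mode colors as RGB tuples with components in [0, 1].
BASE_COLORS = {
    "CAR": (0.8, 0.1, 0.1),
    "BIKE": (0.1, 0.6, 0.1),
    "WALK": (0.1, 0.3, 0.8),
}


def shade(rgb, factor):
    """Scale lightness by `factor` (<1 darkens, >1 lightens), clamped to [0, 1]."""
    h, l, s = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(h, max(0.0, min(1.0, l * factor)), s)


def colors_for_modes(mode_to_base, step=0.25):
    """Return {mode: rgb}. The first mode of each base mode keeps the base
    lightness; each later mode with the same base is darkened by another `step`."""
    seen = {}
    result = {}
    for mode, base in mode_to_base.items():
        n = seen.get(base, 0)
        result[mode] = shade(BASE_COLORS[base], 1.0 - n * step)
        seen[base] = n + 1
    return result


# Custom modes mapped to their base modes (hypothetical input shape).
custom_modes = {"drove_alone": "CAR", "shared_ride": "CAR", "walk": "WALK"}
mode_colors = colors_for_modes(custom_modes)
```

With this input, "shared_ride" comes out as a darker red than "drove_alone" because both share the base mode CAR.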

This is more of a food for thought discussion at the present moment, but I think generating a color map from the labels in a more central manner in the dashboard code, and then later delegating that functionality to a central location, seems like a plan that would work keeping our later priorities in mind.

iantei commented 4 months ago

PR for fix: https://github.com/e-mission/em-public-dashboard/pull/117

iantei commented 4 months ago

Different colors currently show up for the same label because of the following:

"color mapping" is based solely on the list of labels passed in to the function on this specific occasion, influenced by different subset or ordering changes that can happen in datasets chart-to-chart.

The issue primarily seems to be with the filtered labels passed in, which create a different map of color to mode/purpose/replaced mode.

The proposed solution is the following:

Extract the list of unfiltered labels (mode/purpose/replaced mode) at the top of the notebook, and pass it into each individual function to create a map of color to mode/purpose/replaced mode based on the existing implementation.

This way, we do not need to handle separately the scenarios where the default mapping or the dynamic config mapping is available.

Abby-Wheelis commented 4 months ago

> The issue primarily seems to be with the filtered labels passed in, which create a different map

Correct, this is what I was trying to say when I said that they varied by occasion and subset/ordering. Filtering is a good way to look at it!

> Extract the list of unfiltered labels (mode/purpose/replaced mode) at the top of the notebook, and pass it into each individual function to create a map of color to mode/purpose/replaced mode based on the existing implementation

So you're saying to create three different lists? Mode, purpose, and replaced mode and then pass one of those lists into the function, still making the color map within the function?

I think that would definitely work, since it keeps the list of labels used to create the map consistent between the charts. I think it might open more doors down the line, when we get to unifying the colors with the phone, to create the color map in the notebook and then pass it into the function instead of unfiltered labels. I might not be seeing challenges that exist with that, though. Is there something blocking color mapping in the notebook instead of the function? Or do you have any thoughts about how passing unfiltered labels is better than passing a color map?

iantei commented 4 months ago

> So you're saying to create three different lists? Mode, purpose, and replaced mode and then pass one of those lists into the function, still making the color map within the function?

Yes, I am proposing to create three different lists. Each will have a different mapping, which will help the end user distinguish between different categories of charts at a higher level too.

> I might not be seeing challenges that exist with that, though. Is there something blocking color mapping in the notebook instead of the function? Or do you have any thoughts about how passing unfiltered labels is better than passing a color map?

Initially, I thought this would just be a trade-off of creating the map at the top level of the notebook instead of inside the pie-chart function. But there is a slight issue.

Let us consider the below scenario:

    all_ct = agg.get_data_df("analysis/confirmed_trip", tq)
    print("Loaded all confirmed trips of length %s" % len(all_ct))

In the case where the loaded dataset doesn't have any confirmed trips for a recent time range like 2024/02, this results in:

    Loaded all confirmed trips of length 0

Currently, an error in the extraction labels_mc = expanded_ct['Mode_confirm'].value_counts(dropna=True).keys().tolist() is handled in a try-except block and is displayed as an error in generating the charts.

If we want to do the color mapping in the notebook, we would need to extract the lists of Mode_confirm, Trip_purpose, and Replaced_mode there, which will result in the above issue whenever there are no data items for the time range. Therefore, for now, I think creating the color map inside the function by passing unfiltered labels seems like a good solution.

Abby-Wheelis commented 4 months ago

> labels_mc = expanded_ct['Mode_confirm'].value_counts(dropna=True).keys().tolist()

Are you retrieving the list of labels from the data? I was thinking we could use the list from the labels (either default or dynamic). That way, if today nobody has ever labeled "motorcycle", but tomorrow someone labels "motorcycle", the mapping does not change when the list of modes that exist in the data changes. If we use the list of labels (either dynamic or default) then we could have all possible labels, and it would be (almost) guaranteed not to change over time, since the list of possible labels is unlikely to change.

I think this is similar to what is done for sensed mode: there is a static list of sensed modes that never changes. That way, if today nobody has ever gone fast enough for there to be a trip sensed as AIR_OR_HSR in the dataset, but tomorrow someone takes a long flight, the list of labels used to make the map, and therefore the color map, never changes.

I'm realizing we did just add to the list of labels hybrid drove alone and hybrid shared ride, so that "complete" list of labels just changed -- another good reason to aim for more unified color mapping down the line, and a reminder of why we should ensure the design of that library is extensible.
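To make the "motorcycle" case concrete, here is a small sketch comparing a map built from labels seen in the data with one built from the list of possible labels. The label lists, the `PALETTE` hex strings, and the `color_map` helper are all assumptions for illustration.

```python
# Stand-in palette (assumption) in place of plt.cm.tab20.colors.
PALETTE = ["#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a"]


def color_map(labels):
    return dict(zip(labels, PALETTE[:len(labels)]))


# Hypothetical full list of possible mode labels (default or dynamic).
possible_modes = ["walk", "bike", "motorcycle", "drove_alone"]

# Labels that happen to appear in the data, ordered by frequency.
seen_today = ["drove_alone", "walk", "bike"]
seen_tomorrow = ["drove_alone", "motorcycle", "walk", "bike"]

# Built from the data: "walk" shifts color once "motorcycle" shows up.
walk_today = color_map(seen_today)["walk"]
walk_tomorrow = color_map(seen_tomorrow)["walk"]

# Built from the list of possible labels: independent of what the data contains.
stable_walk = color_map(possible_modes)["walk"]

print(walk_today, walk_tomorrow, stable_walk)
```

The data-derived map reassigns "walk" as soon as a new label appears, while the label-list map only changes if the list of possible labels itself changes.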

iantei commented 4 months ago

> Are you retrieving the list of labels from the data?

Actually, yes. Ah, I see your point of concern here. For dynamic labels, it's easy to pass them through generate_plots.py, as we are already extracting the dynamic_labels from the config URL. I was just wondering how to extract the mapping for the default ones.

Maybe we can make use of auxiliary_files/mode_labels.csv and auxiliary_files/purpose_labels.csv to extract the values for the default mapping too. A better way might be to use the dictionaries of purpose and replaced mode already available in the notebook.
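If the CSV route were taken, extracting the default labels could look something like the sketch below. The column names and rows in `SAMPLE_CSV` are assumptions standing in for auxiliary_files/mode_labels.csv (the real file's layout may differ), and `default_labels` is a hypothetical helper.

```python
import csv
import io

# Stand-in for auxiliary_files/mode_labels.csv. The column names and rows
# are assumptions, not the real file's contents.
SAMPLE_CSV = """mode_confirm,mode_clean
drove_alone,Gas Car Drove Alone
shared_ride,Gas Car Shared Ride
walk,Walk
"""


def default_labels(csv_text, column):
    """Return the unique values of `column`, preserving file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = []
    for row in reader:
        if row[column] not in seen:
            seen.append(row[column])
    return seen


default_modes = default_labels(SAMPLE_CSV, "mode_clean")
print(default_modes)
```

Preserving file order matters here: a list with a stable order yields the same label-to-color assignment on every run.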

shankari commented 3 weeks ago

This is on production, closing