GreenInfo-Network / seattle-building-dashboard

Energy benchmarking for Seattle
https://greeninfo-network.github.io/seattle-building-dashboard/
ISC License

Histogram colors do not match map colors, especially 0 #77

Closed tomay closed 8 months ago

tomay commented 8 months ago

The histograms that accompany the map layers on the left panel also act as a kind of legend.

For some reason, the colors represented do not always match the full range of colors on the map.

Here is GHG Intensity for example.

Histogram:

image

Map:

image

The blue color in particular is not represented in the histogram, although it is the map color for total_ghg_emissions_intensity = 0.

This is not a new issue, and has probably been present all along. In fact, a comment in the colorGradient prototype function in building_bucket_calculator.js offers a cryptic explanation:

// This is how we calculate the colors for the dots on the map.
// But they don't line up with the colors in the histogram. Why not?

// The domain is "fieldValues", which is an unordered list of all of the building value for this field.
// But the domain for the histogram color ramp is just linear max and min for the given field.
// And more importantly, it needs to be the max and min that's set according to the config file. That's how the colors get determined in the histogram
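To make the mismatch described in that comment concrete, here is a minimal sketch in plain JavaScript. It is not the dashboard code: the sample data, the five-class count, and the helper names `linearClass` and `quantileClass` are all made up for illustration, and the quantile classing is a simple rank-based approximation. It classifies the same value two ways, once against a linear min/max ramp and once against the distribution of the data.

```javascript
// Hypothetical, skewed sample data: most buildings have low values.
const fieldValues = [0, 0.2, 0.3, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5, 9.5];
const classCount = 5;

// Histogram-style ramp: split [min, max] into equal-width classes.
function linearClass(value, min, max, n) {
  const t = (value - min) / (max - min);
  return Math.min(n - 1, Math.max(0, Math.floor(t * n)));
}

// Map-style ramp (rank-based approximation of quantile classing):
// the class depends on how many data values lie below this one.
function quantileClass(value, values, n) {
  const below = values.filter((v) => v < value).length;
  return Math.min(n - 1, Math.floor((below / values.length) * n));
}

// Classify one value both ways, using the config's filter_range of 0 to 10.
const sample = 1.2;
const byLinear = linearClass(sample, 0, 10, classCount);
const byQuantile = quantileClass(sample, fieldValues, classCount);
console.log({ byLinear, byQuantile });
```

With skewed data like this, the two approaches put the same value into different classes, which is one plausible way the histogram and map colors can drift apart.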

The config file seattle.json does set the proper min and max.

Here's total_ghg_emissions_intensity:

        {
            "title": "Seattle GHG Intensity",
            "field_name": "total_ghg_emissions_intensity",
            "display_type": "range",
            "range_slice_count": 18,
            "section": "Greenhouse Gas Emissions",
            "color_range": ["#1f5dbe","#599b67","#ffd552","#da863f","#ab2328"],
            "hatch_null_css": true,
            "unit": "Kilograms CO₂e/ft²",
            "formatter": "fixed-1",
            "filter_range": {"min" : 0, "max" : 10},
            "

So I don't know whether this is an oversight or what exactly is going on.

tomay commented 8 months ago

The way this works is incredibly complicated and circular:

  1. D3 is used to set up a quantile scale, based on the color range specified in seattle.json and the values for the given variable. In the case of total_ghg_emissions_intensity, that amounts to a domain of ~3600 values and a range made up of the original colors spread across 18 derived colors. (It is unclear why D3 is not left to derive the colors in the scale function itself. Most likely this is to give a bigger spread to the map, which uses hard-wired CartoCSS ranges, not D3 functions; see below.)

  2. But then, when the actual color is applied to a bar on the histogram, the xpos of the bar (a pixel value?) is fed into yet another D3 scale (linear) before being passed to the original quantile colorScale.

  3. This same quantile scale is used indirectly to make the CartoCSS. In the end, to me, this seems much more akin to a threshold scale (which directly specifies the cut values that separate the classes) than to the original quantile scale (intervals of similar sizes), as you can see in the resulting CartoCSS statements and the code. Each "stop" is derived from a call to the scale's invertExtent on each color in the range, which is the equivalent of a threshold.

  4. The end result is that a value of 0 gets the expected color for 0 on the map, but in the original scale function (xpos passed to a linear scale, then passed to a quantile scale), a value of 0 is assigned the color class #b5bb5b, which is the 7th of the 18 colors in the derived range:

(18) ['#1f5dbe', '#306fa5', '#40808c', '#519273', '#6ba165', '#90ae60', '#b5bb5b', '#dac857', '#ffd552', '#f7c34e', '#efb24a', '#e6a045', '#de8f41', '#d57b3c', '#ca6537', '#c04f32', '#b5392d', '#ab2328']
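Point 3 above, that stops derived via invertExtent amount to a threshold scale, can be sketched as follows. This is a plain JavaScript stand-in, not D3 and not the dashboard code: `makeQuantileScale` and `colorFromStops` are hypothetical helpers, the data values are invented, and only the five base colors come from seattle.json. The stand-in follows the usual quantile-scale idea (interpolated cut points over the sorted data), which is an assumption about how the real scale behaves.

```javascript
// Minimal stand-in for a quantile scale: it maps data values to range
// entries so that each entry covers a similar number of data points.
function makeQuantileScale(domain, range) {
  const sorted = [...domain].sort((a, b) => a - b);
  const n = range.length;
  const thresholds = [];
  for (let i = 1; i < n; i++) {
    // Interpolated quantile at i/n of the sorted data
    const pos = (sorted.length - 1) * (i / n);
    const lo = Math.floor(pos);
    const hi = Math.min(lo + 1, sorted.length - 1);
    thresholds.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  const scale = (v) => range[thresholds.filter((t) => t <= v).length];
  // Value interval that maps to a given range entry
  scale.invertExtent = (color) => {
    const i = range.indexOf(color);
    return [
      i > 0 ? thresholds[i - 1] : sorted[0],
      i < n - 1 ? thresholds[i] : sorted[sorted.length - 1],
    ];
  };
  return scale;
}

// Hypothetical data; the five base colors are the ones in seattle.json.
const values = [0, 0.2, 0.3, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5, 9.5];
const colors = ["#1f5dbe", "#599b67", "#ffd552", "#da863f", "#ab2328"];
const colorScale = makeQuantileScale(values, colors);

// One "stop" per color: the lower bound of that color's extent.
const stops = colors.map((c) => ({ color: c, from: colorScale.invertExtent(c)[0] }));

// A threshold-style lookup built only from the stops.
function colorFromStops(v) {
  let picked = stops[0].color;
  for (const s of stops) if (v >= s.from) picked = s.color;
  return picked;
}

// Both lookups agree for every data value.
console.log(values.every((v) => colorScale(v) === colorFromStops(v)));
```

Once the stops are extracted, the original data is no longer needed to color a value, which is why the resulting CartoCSS reads like a list of thresholds.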

Questions and observations

tomay commented 8 months ago

After a bit more thought, I think this is a wontfix.

Each bar in the histogram represents a large range, not a discrete value. That's what the linear scale on the xpos is trying to model.

The first bar in Total Seattle GHG Emissions, for example, isn't just representing 0; it's representing a range of values, something like 0 - 20 (there are no scales printed, so it's not easy to say exactly):

image

Buildings, on the other hand, are painted according to a specific value for that one building.
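The contrast between a bar and a building can be sketched like this. Everything here is hypothetical: the cut values, the building values, and especially the choice to color the bar by its midpoint (the real code feeds xpos through a linear scale; the midpoint is just a stand-in for "one representative value per bar").

```javascript
// Hypothetical class cut values and colors (a threshold-style lookup).
const cuts = [2, 5, 9, 14];
const palette = ["#1f5dbe", "#599b67", "#ffd552", "#da863f", "#ab2328"];
const colorFor = (v) => palette[cuts.filter((c) => c <= v).length];

// One histogram bar covering the value range 0 to 20.
const bar = { from: 0, to: 20 };
// The bar gets a single color, here taken from its midpoint (an assumption).
const barColor = colorFor((bar.from + bar.to) / 2);

// Buildings that all fall inside that bar, each colored by its own value.
const buildings = [0, 1.5, 4, 8, 12, 19];
const buildingColors = buildings.map(colorFor);

// Distinct map colors that sit behind the single bar color.
const distinct = [...new Set(buildingColors)];
console.log(barColor, distinct);
```

In this sketch the bar is drawn in one color while the buildings inside it span the whole palette, including the blue for 0 that the bar itself never shows.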

There is no way to "fix" this. The histogram is not a legend; it is a broad brush stroke showing the range and distribution of all the values.