odileeds / ukpowernetworks

UK Power Networks' potential future scenarios as defined in the National Grid Future Energy Scenarios (DFES).

Change colour gradient #18

Closed. ash001j closed this issue 3 years ago.

ash001j commented 3 years ago

After discussion with Colin, we would like to change the colour scheme from a continuous gradient to one with five set colours. This will make it easier for people to read the map when looking at a static image of it. Hopefully you can see the example I've attached, but please let me know if you would like more clarity from us.

[Attached image: Legend_Green_Transformation_ElectricCars]

slowe commented 3 years ago

First, I'll just echo my warning that this introduces artificial steps in the data that could be misleading. It isn't something I recommend for an interactive visualisation with continuous variation (no categories or ranges with meaning) where there are other ways for values to be read.

However, given that it is your preferred option, we will need a bit more information about how you want this implemented. Different implementations will require vastly different amounts of work. The example given shows unequal categories. Whilst equal categories can be automated from the data, unequal categories will need to be defined by you. If you wish to have unequal categories you will need to define a scale for each scenario/parameter combination within each year and across the whole period to 2050. I will also need to create a method for these to be encoded and loaded.

ash001j commented 3 years ago

I understand that the potentially misleading steps are an issue, but I think the lack of clarity when looking at the static image is more important to fix. Alternatively, would it be possible to generate the categories on the fly, so that exporting the map as an image generates category bins and shows them only in the static image?

For the categories themselves, by 'equal categories' do you mean that the categories are all the same size, i.e. 0-50, 50-100, 100-150 and so on? If so, does it specifically need to be numerically equal? Could it be equal quantiles, i.e. lowest 20%, 20%-40%, 40%-60%, 60%-80%, and then the upper 20%? Thanks.

slowe commented 3 years ago

@ash001j I would prefer it to be numerically equal. Making quintiles involves more work and brings in issues around labelling and visual impact. What do you do in situations where getting five equal quintiles is hard, e.g. if you had a distribution 1,1,1,1,1,1,1,1,1,1,1,2,2,2,3? In cases where the distribution is very uneven you have the potential to make very different places look the same, e.g. 1,1,1,1,1,1,1,1,2,100 would give 1-1, 1-1, 1-1, 1-1, 2-100 as the quintiles. If you choose to go with "20-40%" type labels you remove the ability of people to know the values. The visualisation colours are therefore likely to be non-linear (in value; linear in quintile) which can be confusing. If you choose not to show the quintile labels (e.g. "20-40%") you then need to decide how to label the boundaries. Do you pick the mid value between the highest value in one quintile and the lowest in the higher quintile? Do you pick the lowest value in a quintile? These numbers will (most likely) be at less-than-friendly numbers.
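To make the quintile concern concrete, here is a minimal Python sketch. It is not code from this repository: `quintile_ranges` is a hypothetical helper that sorts the values, splits them into five equal-count groups, and reports the lowest and highest value in each group.

```python
def quintile_ranges(values, n_classes=5):
    """Split sorted values into n_classes equal-count groups.

    Returns one (lowest, highest) pair per non-empty group.
    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    ranges = []
    for i in range(n_classes):
        # Integer arithmetic keeps the group sizes as even as possible.
        start = i * n // n_classes
        end = (i + 1) * n // n_classes
        group = ordered[start:end]
        if group:
            ranges.append((group[0], group[-1]))
    return ranges


# The two distributions quoted in the comment above.
print(quintile_ranges([1, 1, 1, 1, 1, 1, 1, 1, 2, 100]))
# [(1, 1), (1, 1), (1, 1), (1, 1), (2, 100)]

print(quintile_ranges([1] * 11 + [2] * 3 + [3]))
# [(1, 1), (1, 1), (1, 1), (1, 2), (2, 3)]
```

In the first case an area with a value of 2 and an area with a value of 100 land in the same class. In the second, identical values of 1 are spread across four different classes, so areas with the same value would be drawn in different colours.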

ash001j commented 3 years ago

Numerically equal would be easier, but in some instances it could lead to quite an unequal distribution. For example, in the early years of EV uptake it is concentrated in a few LSOAs with much higher EV numbers than most other LSOAs, as in your example of 1,1,1,1,1,1,1,1,2,100. The equal classes would be 1-20 (9 elements), then 20-40, 40-60 and 60-80 would all be empty, with only 80-100 having another element in it.
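The same example can be checked with a short Python sketch of equal-width classes. This is an illustration under stated assumptions, not the tool's implementation: `equal_width_edges` and `equal_width_counts` are hypothetical helpers, and the classes are assumed to span the data's minimum to maximum, so the boundaries fall at 20.8, 40.6 and so on rather than the rounded 20, 40 used above.

```python
def equal_width_edges(values, n_classes=5):
    """Return n_classes + 1 boundaries evenly spaced from min to max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    return [lo + i * step for i in range(n_classes + 1)]


def equal_width_counts(values, n_classes=5):
    """Count how many values fall into each equal-width class."""
    lo, hi = min(values), max(values)
    counts = [0] * n_classes
    for v in values:
        if hi == lo:
            idx = 0  # all values identical: everything goes in the first class
        else:
            # Scale to 0..n_classes; the maximum is clamped into the top class.
            idx = min(int((v - lo) / (hi - lo) * n_classes), n_classes - 1)
        counts[idx] += 1
    return counts


data = [1, 1, 1, 1, 1, 1, 1, 1, 2, 100]
print([round(e, 1) for e in equal_width_edges(data)])
# [1.0, 20.8, 40.6, 60.4, 80.2, 100.0]
print(equal_width_counts(data))
# [9, 0, 0, 0, 1]
```

Nine of the ten areas share the lowest colour and the three middle colours go unused, which is the uneven spread described above.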

Also, I'm not sure what you mean by the issue around labeling the boundaries. They would just be labelled as the range they capture, e.g. "20 - 40". However, if it involves more work to create a method for encoding and loading, then using equal classes may be the best way forward.

Finally, did you see my comment asking if it's possible to generate categories on the fly and only show them when someone wants to print the map as an image, so that when the map is being used online it stays with the continuous gradient? I think this would be the best of both worlds, as it would keep the accuracy of the continuous gradient and make the static map easier to read.

slowe commented 3 years ago

@ash001j The data are, generally, of an unequal distribution. You seem to have strongly indicated that you want the visualisation to look as though there are an equal number of areas (LSOAs/LADs etc) in each category, hence why you are suggesting quintiles.

The issue I was attempting to explain around labeling the boundaries was because of your example at the top of the thread that uses value-based labels. It wasn't about "0-20%", "20-40%", "40-60%", "60-80%", "80-100%" style labels (these obviously have different problems with understanding).

Yes I saw the comment about only showing these categories in the event that someone "wants to print the map as an image". This suggestion involves dramatically more work because I still have to code the category situation but then have the additional work of having to define multiple ways of styling and correctly switching between them. Also, you should note that there is no way for me to change the colour scheme in the event that someone uses their inbuilt browser screengrab tool or the Mac/Windows/Linux system screengrab tools. Also you will have the confusing situation that someone will ask for a screenshot and what they get is not what they had been looking at.

ash001j commented 3 years ago

Fair point on distribution, and thanks for the clarification on labeling issues. I tend to prefer using quintiles but seeing as it's simpler to code and easier to read, let's use equal classes instead.

Shame about the class generation on the fly, it would have been a neat feature, although I see the potential confusion about getting a screenshot that wasn't what you initially looked at.