Encode absolute count to circle size

trvrb commented 5 years ago

@jameshadfield, @jotasolano ---

I believe the choropleths were not presenting the core thing they needed to present, that is the count of encounters in a region. Incidence is the primary data variable we want to display. Sex, age, vaccination, etc. are secondary colorings. There is no easy way in the existing app framework to display a very simple thing:

subset to H3N2 and look at counts of H3N2 cases across the map

The map always displays the "secondary" data variable (vaccination, etc...).

This PR addresses this by swapping the choropleth for a circle sized according to absolute total count from this region. There is certainly more to do here in terms of pie charts vs circles and color embeddings, but I believe that question is separable from choropleth vs circle.
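As a rough illustration of sizing a circle by absolute count, here is a minimal sketch. It is not taken from this PR's diff: the `circleRadius` helper, its parameters, and the choice to scale area (rather than radius) linearly with count are all assumptions made for the example.

```typescript
// Hypothetical helper, not the code in this PR.
// Assumption: circle AREA is proportional to the count, so the radius
// grows with the square root of the count. This is a common convention
// for bubble maps; the PR itself may scale differently.
function circleRadius(
  count: number,
  maxCount: number,
  maxRadius: number = 30 // radius in pixels for the largest region (assumed)
): number {
  if (count <= 0 || maxCount <= 0) {
    return 0; // nothing to draw for empty regions
  }
  // Clamp so a count above maxCount never exceeds maxRadius.
  const fraction = Math.min(count, maxCount) / maxCount;
  return maxRadius * Math.sqrt(fraction);
}
```

With area scaling, a region holding a quarter of the maximum count gets half the maximum radius, so single-case census tracts shrink to small dots instead of filling a whole polygon.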

Here are two examples of changes in app behavior:

Neighborhood / vaccination via choropleth:

Neighborhood / vaccination via circle:

Census tract / sex via choropleth:

Census tract / sex via circle:

You can especially see in the census tract version that many of the choropleth colorings are based on a single data point and are overly emphasized. By having these just be tiny circles, these census tracts are properly deemphasized. I can see meaning in the latter, but not in the former.

jotasolano commented 5 years ago

Hi Trevor, please see my answers below:

I believe the choropleths were not presenting the core thing they needed to present, that is the count of encounters in a region. Incidence is the primary data variable we want to display.

I agree with you! When we look at our boards for "competitive analysis", incidence is the primary metric that is shown (or at least the most general). In the InVision mockups, this is shown in the first slide, and you could get to it by selecting the "prevalence" variable under the "modeling" mode (https://invis.io/J3QDSNMT2D9#/353703459_1_Dynamics_modeling_prevalence). The missing table there is the one in Observable:

There is no easy way in the existing app framework to display a very simple thing[...]

Maybe what we did in that InVision mockup is the way we could implement this ☝️, although I'm still curious to test this flow and see whether people consider that "incidence" belongs to "modeling" (semantics/mental-model wise).

This PR addresses this by swapping the choropleth for a circle sized according to absolute total count from this region.

I think this is a step in the right direction! However, I do have concerns about multiple variable encoding and the cognitive load required for a good-enough interpretation of the charts: when I look at the last two pictures (incidence in choropleth vs as radius of circles), I have a hard time interpreting the latter. This is probably because the circles are encoding two variables (size of the population as radius and incidence as color). It's also likely that the color scale is not ideal, as a single color ramp would be more adequate for this type of data (e.g. light red to dark red).

It's also known that color perception is affected by its area and neighboring colors, so having very small circles can also pose a problem for the interpretation of the data. I think most incidence visualizations solve this by not using raw values and instead normalizing the numbers. I guess the question we need to ask here is: am I interested in looking at raw values and total incidence, or am I interested in being able to compare incidence across every deme/region? Perhaps we could provide a toggle between a normalized choropleth and a bubble map with absolute values?
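To make the toggle idea concrete, here is a minimal sketch of switching between a normalized value and an absolute count. Everything in it is hypothetical: the `RegionCounts` shape, the `MapMode` names, and especially the `population` denominator, which this thread does not say the app has available.

```typescript
// Hypothetical data shape; the population field is an assumed denominator.
interface RegionCounts {
  region: string;
  cases: number;
  population: number;
}

// "normalized" would drive a choropleth, "absolute" a bubble map.
type MapMode = "normalized" | "absolute";

// Returns the value to encode for a region under the selected mode.
function mapValue(r: RegionCounts, mode: MapMode, per: number = 1000): number {
  if (mode === "absolute") {
    return r.cases; // raw count, suited to circle size
  }
  // Cases per `per` residents; regions without a population estimate map to 0.
  return r.population > 0 ? (r.cases * per) / r.population : 0;
}
```

The normalized value supports comparison across regions of different size, while the absolute value answers where the cases actually are.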

Finally, the heat map I implemented should work almost out of the box with incidence data, and we'd get an incidence table much like the one we have in Observable. This would accompany the map nicely, I think.

jameshadfield commented 5 years ago

The map always displays the "secondary" data variable (vaccination, etc...).

Note that this is called the "Primary" variable in the viz. I think all that's missing is the correct set of options for this variable, i.e. the default option should be "incidence" or "counts".

Would be happy to discuss things in person next week.

seattleflu / genomic-incidence-tracker

Encode absolute count to circle size #12