skent259 / covid-19

Misc files and data things related to covid-19

graph enhancements: options for small multiples. etc #4

Open klittle314 opened 4 years ago

klittle314 commented 4 years ago
  1. I like the geo facet approach.
    1. The cumulative plot of cases in the NYTimes also looks fine.
    2. My design view: allow any measure to be faceted by county, as well as having a statewide view.
    3. It is also useful to have some kind of display that overlays traces, e.g. your landing page or the analogue of my median benchmark graph on the clinic site.
    4. Measures should include conversion of counts to rates (normalize by county population).
    5. Hospitalization counts (or estimates) seem important?
skent259 commented 4 years ago

Can you elaborate on 4?

Right now I'm looking into 1 and 5. 5 should be easy and I think I have a good way to do 1.

klittle314 commented 4 years ago
  1. Idea: Given a county choice (in a dropdown), make that county visually dominant and put the other counties in the background, along with the statewide summary. This makes graphical sense only for rates; otherwise MKE county dominates and shrinks the count scale, unless you force a log scale?
    image My example from clinic data: we chose the Missouri Highlands clinic, which shows up in blue (the analogue of the county); the summary of the entire set of clinics is in black (the analogue of the state). Rather than just grey dots, the background could show trajectories (line plots).

I'll build an example and send it to you this afternoon.

An alternative to ggplot is something interactive (plotly?) where you can hover and click. I haven't done interactive shiny plots.

aravamu2 commented 4 years ago

Do you mean something like this?

image

There is something analogous in plotly. I can create a pull request.

aravamu2 commented 4 years ago

Pull Request: Update testing_wisconsin-covid.Rmd #8 Commit: f3e3c1a6eda2b9134c115ab21a65e0ff289491d6

aravamu2 commented 4 years ago

What did you have in mind for geofacet? Did you mean something like this?

image

Ideally, it would be great to get hospital capacities as an upper bound for each county.

klittle314 commented 4 years ago

Yes, that's what Sean intended. Allow normalizing all measures (cases, tests, hospitalizations, ICU admissions, and deaths) by county population, e.g. using https://www.indexmundi.com/facts/united-states/quick-facts/wisconsin/population#table. Cases need to be converted to hospitalizations for an overlay of hospital capacities to be meaningful.
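A minimal sketch of the normalization described here, written in Python only to keep it self-contained (the app itself is R/shiny). The function name `rate_per_100k`, the per-100,000 scaling, and the county figures are all assumptions for illustration, not values from the dataset.

```python
def rate_per_100k(count, population):
    """Convert a raw count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return count * 100_000 / population


# Hypothetical illustration values, NOT real Wisconsin data.
counties = {
    "County A": {"cases": 300, "population": 950_000},
    "County B": {"cases": 12, "population": 20_000},
}

# The same helper works for any measure (tests, hospitalizations, deaths).
rates = {
    name: rate_per_100k(d["cases"], d["population"])
    for name, d in counties.items()
}
```

With these made-up numbers, the small county has far fewer cases but the higher rate, which is the kind of pattern raw counts hide.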

klittle314 commented 4 years ago

In the overlay chart, also allow the user a choice between counts and rates. Normalizing by population puts the Wisconsin state case rate growth 'in the middle', so you can see each county relative to the state rate. We expect Milwaukee, Dane, and Waukesha counties to have more cases for multiple reasons, primarily population size and access to testing. Did you make it in plotly to allow interactive mouse-over of each trace?
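Why the state rate lands 'in the middle': the statewide rate is total cases over total population, which is a population-weighted average of the county rates, so it can never fall outside the range of the county rates. A small Python sketch with hypothetical numbers (the names `county_rate` and `state_rate` are illustrative, not from the app):

```python
# Hypothetical illustration values, NOT real Wisconsin data.
counties = {
    "County A": {"cases": 300, "population": 950_000},
    "County B": {"cases": 12, "population": 20_000},
    "County C": {"cases": 50, "population": 500_000},
}


def county_rate(county):
    """Cases per 100,000 residents for one county."""
    return county["cases"] * 100_000 / county["population"]


def state_rate(all_counties):
    """Statewide cases per 100,000: pooled cases over pooled population."""
    total_cases = sum(c["cases"] for c in all_counties.values())
    total_pop = sum(c["population"] for c in all_counties.values())
    return total_cases * 100_000 / total_pop


county_rates = [county_rate(c) for c in counties.values()]
statewide = state_rate(counties)
```

The same does not hold for counts: the statewide count is the sum of the county counts, so it always sits at or above every county trace.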

skent259 commented 4 years ago

For geofacet, I made something similar to what @aravamu2 has. Agree that normalizing (there's a population variable in the dataset) makes the patterns emerge much better.

My problem is that it takes a long time to render, and I'm worried that will slow the shiny app down too much. Thoughts on how to address this?

aravamu2 commented 4 years ago

@klittle314 Agree that normalizing by population is better. I believe the option of counts and rates will be implemented in shiny. The plotly visualization does allow interactive mouse over of each trace.

image

Agree that cases need to be converted to hospitalizations to have a relevant meaning. The best I can find are approximate hospitalization rates at the state level.

@skent259 It takes a while to generate and render for me as well (~4.5 seconds). How slow is too slow? Also, I was wondering if you facet wrap or arrange plots into a grid manually?

aravamu2 commented 4 years ago

The best workaround I can find is separating each set of visualizations into its own tab in shiny, so that each renders separately.

skent259 commented 4 years ago

@aravamu2 I implemented the context plot in the shiny app, but I held off on implementing it in Plotly because the log10 scale looks terrible and I can't fix it. I found a way to get it to work in the colorbar, see https://github.com/skent259/covid-19/blob/master/Wisconsin-Covid-19/app.R lines 170-173. If anyone knows how to do something similar for the y-axis, then I can make plotly work.

I'm open to putting it in a separate tab, but I'm not sure the small multiples plot adds value when we have the map (which can show data for any day) and the context plot now implemented.

aravamu2 commented 4 years ago

You will need an if statement that switches between two different ggplot statements. For a log scale, you will need to transform cases to log(cases + 1) manually. The issue with scale_y_continuous(trans = "log10") or scale_y_log10() is that plotly transforms cases to log(cases) automatically, so the lines do not connect. Hope this helps!
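One plausible reading of the disconnected lines is that log of a zero count is undefined, which leaves a gap in the trace. The Python sketch below (an illustration of the arithmetic only, not the R code from the pull request; base 10 and the helper names are assumptions) compares a plain log transform with the log(cases + 1) variant:

```python
import math


def log10_or_none(count):
    """log10 of a count, or None where the log is undefined (count == 0)."""
    return math.log10(count) if count > 0 else None


def log10_plus_one(count):
    """log10(count + 1): defined for every non-negative count."""
    return math.log10(count + 1)


cases = [0, 9, 99]  # hypothetical daily counts

plain = [log10_or_none(c) for c in cases]     # first entry is None: a gap
shifted = [log10_plus_one(c) for c in cases]  # every entry is a number
```

The shifted values are plottable everywhere, at the cost that the axis now shows log(cases + 1) rather than cases.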

skent259 commented 4 years ago

I should have been more specific. There’s a difference between showing the log values and showing things on a log scale. Log values are not interpretable to most people, so we need to show cases on a log scale. Doing the manual plotly method you suggest shows the log values.

I found a workaround for the color scale by manually putting the tick text at the right spot, but I don’t have the time right now to look at the plotly documentation to figure out how to do it for the y axis.
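The tick-text idea can be sketched independently of any plotting library: plot the log10 values, but place ticks at the log10 positions of round numbers and label them with the original-scale values. The Python helper below is a hypothetical illustration (`log_axis_ticks` is not from app.R, whose lines 170-173 are not reproduced here); plotly axes accept such pairs through their `tickvals` and `ticktext` settings.

```python
import math


def log_axis_ticks(max_value):
    """Return (tickvals, ticktext) for powers of ten up to max_value.

    tickvals are positions in log10 units (where the transformed data
    lives); ticktext are the labels readers see, on the original scale.
    """
    if max_value < 1:
        raise ValueError("max_value must be at least 1")
    top = math.floor(math.log10(max_value))
    powers = range(0, top + 1)
    tickvals = [float(p) for p in powers]
    ticktext = [str(10 ** p) for p in powers]
    return tickvals, ticktext


# Hypothetical maximum case count of 2500.
tickvals, ticktext = log_axis_ticks(2500)
```

If the data were plotted as log10(cases + 1) instead, the tick positions would need the same shift, i.e. log10(v + 1) for each labeled value v.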

On Mar 27, 2020, at 1:03 PM, Srikanth Aravamuthan notifications@github.com wrote:

 You will need an if statement using two different ggplot statements and transform cases to log(cases + 1) manually. The issue is plotly transform cases to log(cases) automatically. Therefore, the lines do not connect. Hope this helps!


aravamu2 commented 4 years ago

Good point! The best solution I can find is adding a pseudocount of 1. However, I think the issue is with the values reported for "2020-03-12" or lack thereof. I think it can be fixed with filtering out the cases for "2020-03-12". Take a look and let me know what you think. Hopefully, this works?
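A rough sketch of the filtering step, in Python for illustration only (the pull request itself edits an Rmd file). The record layout and the helper `drop_date` are assumptions; only the date "2020-03-12" comes from the comment above.

```python
# Hypothetical records, NOT the real dataset; None marks a missing report.
records = [
    {"date": "2020-03-11", "county": "County A", "cases": 3},
    {"date": "2020-03-12", "county": "County A", "cases": None},
    {"date": "2020-03-13", "county": "County A", "cases": 8},
]


def drop_date(rows, date):
    """Return the rows whose 'date' field differs from the given date."""
    return [r for r in rows if r["date"] != date]


kept = drop_date(records, "2020-03-12")
```

Dropping the date lets a line plot connect the neighbouring days directly instead of breaking at the missing value.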

Pull Request: Update testing_wisconsin-covid.Rmd #10 Commit: 7c215111e74179579e31670ebbe39e69b67d6a4b