stat231-s21 / Blog-library-cleanverse

Repository for PUG Blog Project – library(cleanverse)
https://stat231-s21.github.io/Blog-library-cleanverse/

Blog Plan #1

Open cpage23 opened 3 years ago

cpage23 commented 3 years ago

We do not plan for our blog to be a direct extension of our Shiny app. We enjoyed working on our first project, but we want to turn our attention to vaccination efforts in the United States. As such, we'll be focusing on different questions.

Datasets

We're planning to draw from six datasets to carry out our research. First, we have two CDC datasets. One contains the number of vaccines received and administered in each state, along with the type of vaccine used. The other contains more detailed day-by-day information. We also have two datasets from the Tennessee Department of Health. One contains demographic information about the people who received vaccines. The other contains county vaccination records. Next, we have a dataset of presidential election results. Finally, we have a demographic dataset from the government of California, which largely mirrors the Tennessee dataset.

These datasets are all attached. They'll require some wrangling.
cdc covid.csv
CA demographics.csv
COVID_VACCINE_COUNTY_SUMMARY.XLSX
TN covid demographics (state).XLSX
US vax by data.csv

Essential Questions

  1. Amaya: Is vaccine rollout by race, ethnicity, and gender equitable in California and Tennessee? Are certain groups getting vaccinated first? Is there much of a difference in equity between California and Tennessee, two non-swing states? I plan to create side-by-side line graphs that compare California's and Tennessee's vaccination rollout by race, ethnicity, and gender, quantified by the percentage of each group vaccinated. I'll also research vaccine rollout events, such as when use of the J&J vaccine was paused and when certain groups (such as the Indian Health Service) began vaccinating subpopulations, and put them on the graphs!

  2. Kriti: My initial question is: how did vaccine administration change over time in the US? I plan to make a map of the United States in which each state is colored by the number of vaccines administered on a given date. So, for instance, a state that administered more vaccines on a given day will be shaded darker than a state that administered fewer. I also plan to animate the map by date, so that we can see a time-lapse of how vaccine administration changed over time in each state.

I'm also considering looking at vaccine types (e.g., Moderna), but am unsure if I will continue with that idea.

  3. Clara: For my part, I'm interested in the narrative that conservative states are receiving more vaccines than their citizens are willing to use. To investigate the claim, I'll look at the CDC dataset about received vs. administered vaccinations. This leads to my first visualization. I'm planning to calculate the ratio of administered to received vaccines in every state. Then, I'll merge that with information about 2020 election results, classifying states as either strong Democrat, moderate Democrat, moderate Republican, or strong Republican, and use a visualization (likely a boxplot) to see if there are meaningful differences in the ratio.

Next, I want to create a map of the United States using largely the same information, to show the ratios visually. It will be colored by ratio strength, and by political election result.

Once that's done, if there is still time, I'm planning to take a look at the Tennessee county-by-county data, to see if there's a difference between vaccination rates on a local level. Here, I'm considering seeing if more "rural" and, therefore, stereotypically more conservative areas have lower per capita vaccination counts than traditionally liberal cities. This will probably take the form of a scatterplot, with population on one axis and vaccinations on the other.
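As a rough illustration of the ratio and classification steps described above, here is a minimal base R sketch. The state names, dose counts, vote shares, column names, and the 10-point cutoff between "moderate" and "strong" are all assumptions made up for the example; none of them come from the attached datasets.

```r
# Toy data: every number and name below is hypothetical.
vax <- data.frame(
  state        = c("A", "B", "C", "D"),
  received     = c(1000, 2000, 1500, 800),
  administered = c(800, 1500, 1350, 480)
)
election <- data.frame(
  state   = c("A", "B", "C", "D"),
  dem_pct = c(62, 53, 47, 36),
  rep_pct = c(36, 45, 51, 62)
)

# Share of received doses that were administered.
vax$admin_ratio <- vax$administered / vax$received

# Democratic margin in percentage points (negative means a Republican win).
election$margin <- election$dem_pct - election$rep_pct

# Assumed 10-point cutoff. With right = FALSE the bins are
# [-Inf, -10), [-10, 0), [0, 10), [10, Inf).
election$lean <- cut(
  election$margin,
  breaks = c(-Inf, -10, 0, 10, Inf),
  labels = c("strong Republican", "moderate Republican",
             "moderate Democrat", "strong Democrat"),
  right = FALSE
)

# One row per state with both the ratio and the political category.
merged <- merge(vax, election, by = "state")
```

From here, something like `boxplot(admin_ratio ~ lean, data = merged)` would give a first look at whether the ratio differs across the four categories.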

Process

See questions. In the end, we plan to use a combination of boxplots, maps, dynamic maps, line graphs, and scatterplots to display information. We will likely also use text and other fun visuals to keep the blog lively.

Status Updates

  1. Data Wrangling: For the first update, on 5/4, we want all data wrangling to be complete.
  2. Blog Parts Finished: On 5/11, everyone will have finished their individual parts.
  3. Complete Project / Presentation Recorded: By 5/13, the project should be complete and the presentation recorded.

katcorr commented 3 years ago

This sounds like an interesting blog I'm excited to read!

Update 1: 10/10

cpage23 commented 3 years ago

Here's our second update!

Amaya: Our team's plan was to get all data wrangling for our final project done by this first status update. Fortunately, I was able to achieve this goal. I have two datasets (one for California, one for Tennessee) that show the proportion of each race that has been vaccinated over time. During the process, I found that Tennessee's racial categories weren't as detailed as California's, but comparing the two graphs is still interesting and effective. I ended up graphing this data over time to check that I did the wrangling correctly, so that puts me a step ahead. I also decided to create four county-level maps (two for CA, two for TN, where each state has one map from Feb 15 and one from April 15) showing vaccine counts. I think it will be interesting to visualize the difference in counts between those two dates and to see how the vaccine rollouts in rural areas pale in comparison to those in urban areas.

Clara: I successfully created my dataset and I'm ready to begin work on my parts of the blog. This included a lot of busywork, such as renaming variables, converting text to lowercase, and filtering. However, there were some more challenging parts. I had to gather and then spread one dataset to get it in the form I wanted. I also had to combine two datasets. As of now, I feel good about beginning the next stage of the project with my new dataset, which is a combination of presidential election data for the year 2020 and CDC data about the number of doses administered/received, as well as the number of people who are fully vaccinated by state. I believe I met our goal for this checkpoint, which was to have the data wrangled.

Kriti: For my dataset, I was able to successfully filter down to the states that we can map with the US map object. Additionally, I am creating a new column in the dataset that will contain the population of each state, so that I will be able to map not only the progression of total vaccine numbers per state, but also the percentage of each state's population that has been vaccinated as of each day. As of now, I am able to create a map for a single day across the United States with all of the states colored in by vaccine numbers, and am working on figuring out how to incorporate gganimate to produce the time-lapse. Additionally, because I am having trouble with editing access for the blog project repository, I will be making and pushing changes to the Shiny repository for the time being.
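The population step Kriti describes could look roughly like the base R sketch below. The data frames, column names, and numbers are hypothetical stand-ins, not the actual CDC file. Note that dividing doses by population gives doses per 100 residents, which only approximates the share of people vaccinated when some vaccines require two doses.

```r
# Toy data: state names, dose totals, and populations are made up.
daily <- data.frame(
  state = c("A", "A", "B", "B"),
  date  = as.Date(c("2021-03-01", "2021-03-02", "2021-03-01", "2021-03-02")),
  total_administered = c(100, 250, 400, 500)
)
pop <- data.frame(state = c("A", "B"), population = c(1000, 4000))

# Attach each state's population to every daily row.
daily_pop <- merge(daily, pop, by = "state")

# Cumulative doses administered per 100 residents.
daily_pop$doses_per_100 <- 100 * daily_pop$total_administered / daily_pop$population

# Keep rows in state/date order so each state's series reads chronologically.
daily_pop <- daily_pop[order(daily_pop$state, daily_pop$date), ]
```

A per-day column like `doses_per_100` can then be mapped to the fill color of each state in the single-day map and reused frame by frame once the animation is working.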

katcorr commented 3 years ago

@cpage23 @asmole22 @krverma22

Great, sounds like you're all on track with the schedule you originally set!

Kriti, I revoked your invitation and then re-invited you to the blog repo -- it no longer says "pending invitation" so maybe it's all set? Could you try pushing something to the blog repo again? I'm also still working on the gganimate animation of the map (I've narrowed it down to an issue with geom_polygon specifically), but will push what I have soon so you can see where it's at. Hopefully I'll figure it out by the end of today!

Update 2: 5/5

krverma22 commented 3 years ago

@katcorr

Hi Professor Correia,

I think the invitation worked! I was just able to push my files from yesterday to the blog repository. Also, thank you so much for taking the time to look into the gganimate issue! I really appreciate your help. Hope you have a great evening!

Sincerely, Kriti

cpage23 commented 3 years ago

All visualizations can be found in the index file, or in our individual folders in separate files.

Amaya:

UPDATE: I was able to complete all of the visualizations proposed (see previous issue for details) and copy the code for them into the combined blog file. I rearranged the datasets and wrangled files so they are accessible from any device with access to the repo.

Clara:

UPDATE: I’ve completed what I expect to be my final four visualizations. I created extra visualizations with my countrywide dataset. As such, I’m officially not planning on looking at the countywide data, which I said was a possibility in my initial ideas. My visualizations include a two-paneled ggplot with maps of the US, a scatterplot, a color-coded map, and a boxplot.

Kriti:

UPDATE: Visualizations were completed, including an animated map of the United States. This completes my original plans to make visualizations.

katcorr commented 3 years ago

@cpage23 @krverma22 @asmole22

Update 3: 5/5