stat231-s23 / blog2-SunnyDay

blog2_SunnyDay_Tabitha_Emma_Lynca

Blog Plan. #1

Open gukama22 opened 1 year ago

gukama22 commented 1 year ago
  1. Our final project will build on our Shiny app and will analyze COVID-19-related data at the global level. We plan to analyze several measures, such as case counts, vaccination rates, per capita deaths, and hospitalizations. We will try to reproduce our statewide and county-wide visualizations at the world level (worldwide trends, then broken down by country). Additionally, we will incorporate spatial data and text analysis (for instance, to compare how official announcements regarding COVID-19 vary across countries, and to analyze sentiment among each country's population). We aim to apply the concepts learned in class to derive meaningful insights from the data. We will use the COVID-19 dataset from Our World in Data for most of this analysis, although we may need some additional data sources.

  2. The blog will have a Shiny component, an interactive map and at least two more visualizations that will provide answers to the questions outlined above. Some of these visualizations may be similar to our midsemester visualizations, but on a worldwide scale rather than statewide.

  3. Schedule:

    • Sunday 4/16: Settling on data sets.
    • Tuesday 4/18: Dividing tasks and starting data wrangling
    • 4/20: Status Update 1
    • 4/24: Data Wrangling Due (Among Ourselves) + Start making the Visualization
    • 4/27: Status Update 2
    • 5/1: In-group presentation to other members + Bringing all the components together
    • 5/4: Presentation/feedback session in class

katcorr commented 1 year ago

Sounds good! Have you seen datasets that have collated text around "official announcements regarding COVID-19 across different countries" or are you thinking you might try to webscrape for that text data?

I look forward to your blog!

Proposal: 10/10

ewstrawbridge commented 1 year ago

Here's my update! I want to do some sort of clustering on country-level information like healthcare coverage, population, and poverty numbers (and maybe some more data), along with case counts. Ideally, I'd like to figure out what characteristics are shared by the countries that took the shortest amount of time to reduce covid levels to a threshold that I'll pick. I will likely use whichever type of variable is most convincing in the visualization, or I'll let users pick and choose. I'm looking into data now to make sure that it covers the time frame I need and also whether the variables have enough levels to make that work.

gukama22 commented 1 year ago

I am considering displaying a graphical representation of two observations and how they relate to each other. Specifically, I was thinking of creating an interactive map that would let users select a region of the world (e.g., a continent or the entire world) and a non-COVID-related observation (such as the top 10 countries by mortality rate, highest or lowest GDP per capita, or population density) in order to display the COVID-related statistics for those countries. The interactive feature would include highlighting the countries of interest and showing relevant information, such as their names. Additionally, I plan to add another level of interactivity by plotting the COVID-related data for the selected countries on a separate line graph.

While I believe that the initial map interactivity is relatively straightforward, the biggest challenge will be implementing the second graph based on the countries selected from the map.

tabicatt commented 1 year ago

I'm planning on doing an analysis of the literature released about covid during the height of the pandemic (approximately 2020-2021). I was able to find a dataset with information about publications available on Dimension, which includes country of origin data (which was hard to find!). I'm hoping to create a visualization that compares the number of publications from specific countries or research organizations (selectable). I also might incorporate the specific publication type (article, preprint, etc.) and whether it is open access. The main component will likely be either an interactive bar graph or lollipop chart, but I might create an interactive map visualization for looking at the publications by country (whichever conveys the data more clearly). I'm also entertaining the idea of a word analysis of titles to look at trends in publication topics.
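
For the title word analysis idea, a minimal sketch of counting common title words might look like the following. It is written in Python for illustration rather than the R used in this project, and the function name `title_word_counts`, the stopword list, and the sample titles are all made up (they are not from the publications dataset).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with", "during"}

def title_word_counts(titles):
    """Count non-stopword tokens across a list of title strings."""
    counts = Counter()
    for title in titles:
        # Lowercase, then keep alphanumeric tokens, allowing internal hyphens ("covid-19").
        words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

# Made-up example titles, for illustration only.
titles = [
    "Vaccine efficacy in adults",
    "Modeling the spread of COVID-19",
    "Vaccine hesitancy during the pandemic",
]
counts = title_word_counts(titles)
print(counts.most_common(3))
```

Grouping the titles by publication month before counting would be one way to turn this into a trend over time.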

katcorr commented 1 year ago

Note that even if you incorporate a shiny app (or apps) into your blog post, the end deliverable is still your blog post website: https://stat231-s23.github.io/blog2-SunnyDay/

(The site is controlled by the index.Rmd and index.html files in your repo, which are currently template examples.) Interactive apps can be incorporated but are not required in the blog post. You'll need to write up interpretations, etc., so it may be better to focus on a few compelling static visualizations rather than letting the user explore a lot of different things in an app with no coherent story to tell.

Update 1: 5/5

tabicatt commented 1 year ago

After getting over the initial wrangling issue with the empty values (thank you!!!) I was able to finish both my wrangling and visualizations! As of now, I have 2 visualizations, both of which have a bit of interactivity with plotly (just specific mouseover info) looking at publications by country and common title topics. I've started working on and thinking about my actual blog post writing and analysis, and I'm sure as I develop what exactly it is I want to say I'll end up going back and tweaking my visualizations, etc. But currently, I think I'm a bit ahead of the schedule we set for ourselves which I'm pretty happy about!

ewstrawbridge commented 1 year ago

OK, so the wrangling worked fine and my code runs; it's just not really as exciting as I wanted. I couldn't get poverty or employment numbers for enough countries to cover the majority, so I'm working just with the OWID data. It's totally fine, I'm just finding some kind of boring-to-visualize results, and what I'm grouping by is a little weird. I want to determine which factors are correlated with countries taking a long time to "recover" from covid, so I figured out how long it took for them to get down below 1 death per million in 2021 (because I felt like 2020 was all over the place, and getting through the first winter was especially hard). Anyway, it's a little weird, so I'm still tinkering with it, but stuff is certainly happening.
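
The "how long until below 1 death per million in 2021" calculation described above could be sketched roughly as follows. This is an illustration in Python, not the R code actually used here: the function name `days_to_recover`, the (date, value) layout, and the sample numbers are all hypothetical, and real OWID data would likely need smoothing and more careful handling of missing days.

```python
from datetime import date

def days_to_recover(series, threshold=1.0, start=date(2021, 1, 1)):
    """Days from `start` until the first date on or after `start` whose value
    is strictly below `threshold`; None if that never happens.

    `series` is an iterable of (date, deaths_per_million) pairs; a value of
    None marks a missing observation and is skipped.
    """
    for day, value in sorted(series, key=lambda pair: pair[0]):
        if day >= start and value is not None and value < threshold:
            return (day - start).days
    return None

# Made-up numbers for one hypothetical country.
example = [
    (date(2020, 12, 30), 0.5),  # before the 2021 window, so ignored
    (date(2021, 1, 1), 3.2),
    (date(2021, 1, 2), 2.1),
    (date(2021, 1, 3), None),   # missing value, skipped
    (date(2021, 1, 4), 0.8),    # first 2021 day below the threshold
]
print(days_to_recover(example))
```

Running the same function once per country would give one recovery time each, which could then be compared against the other country-level variables.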

gukama22 commented 1 year ago

I am running a bit behind on my original plan because, at this point, I haven't done much work on the visualizations. I have completed most of the data wrangling (as usual, I anticipate revisiting the data frame to add more informative observations or remove some). The delay comes from a combination of not being able to use R desktop over the weekend (I found out that the data set was too big to upload to the server, so I got practically nothing done over the weekend) and prioritizing other things over the past couple of days. To meet my deadline of having the visualization done by Monday, May 1st, I need to dedicate more time to it every day. My plan is to focus on developing the Shiny component (establishing the leaflet-map-to-line-graph interactivity), integrating the visualization into my blog, and making it more cohesive.

katcorr commented 1 year ago

Ok! Update 2: 5/5