I was drawn to Kyle Cuilla's take on this week's Tidy Tuesday assignment. His visualization has a lot going on, so I've chosen to focus on it alone.
Tweet Link: (https://twitter.com/kc_analytics/status/1315754152723701760/photo/1) Code Link: (https://github.com/kcuilla/Tidy-Tuesday/blob/main/2020_41/2020_41_NCAA_Tourney.R)
Code/Tools/ approaches we have seen in class that you saw used over the week:
His code looked very similar to a lot of what we learned last week. He of course used the pipe (%>%), and I found numerous uses of mutate(), group_by(), summarize(), filter(), and arrange(), among others.
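As a rough sketch of how these verbs chain together, here is a small pipeline on toy data. The column names and numbers are made up for illustration and are not taken from his script or the actual dataset.

```r
library(dplyr)

# Toy data: made-up schools and records, not the real Tidy Tuesday dataset
tourney <- tibble(
  school = c("A", "A", "B", "B", "C", "C"),
  year   = c(2000, 2001, 2000, 2001, 2000, 2001),
  wins   = c(30, 28, 25, 31, 10, 12),
  losses = c(5, 7, 10, 4, 20, 18)
)

win_summary <- tourney %>%
  mutate(win_pct = wins / (wins + losses)) %>%  # add a derived column
  filter(year >= 2000) %>%                      # keep rows matching a condition
  group_by(school) %>%                          # one group per school
  summarize(mean_win_pct = mean(win_pct)) %>%   # collapse each group to one row
  arrange(desc(mean_win_pct))                   # sort from highest to lowest

win_summary
```

Each step takes the data frame produced by the previous one, which is why the pipe makes this kind of code easy to read top to bottom.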
Code/Tools/ approaches that you enjoyed or that surprised you that we have not seen in class
One thing I noticed right away with this dataset is that some columns, like "tourney_finish", hold character values such as "Champ", "1st", and "2nd". I figured this would have to be sorted out if we wanted to analyze it, and I was intrigued by how he did this. I learned about a new function called case_when(), which can sit inside mutate() and lets you apply multiple if_else() conditions to a vector. For example, he would use it to say "if tourney_finish is equal to "Champ", then replace that with "Champion"". He did this for all the other values in this variable. It seemed like a very neat way to recode entire vectors of data.
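Here is a minimal sketch of that recoding pattern. It is not his exact code: the toy data is made up, and only the "Champ" to "Champion" replacement comes from what I described above. The other replacement labels are placeholders I invented for illustration.

```r
library(dplyr)

# Toy data; the values "Champ", "1st", "2nd" are the ones mentioned above
results <- tibble(
  school = c("A", "B", "C", "D"),
  tourney_finish = c("Champ", "1st", "2nd", "Champ")
)

results <- results %>%
  mutate(
    tourney_finish = case_when(
      tourney_finish == "Champ" ~ "Champion",
      tourney_finish == "1st"   ~ "First Round",   # placeholder label (assumption)
      tourney_finish == "2nd"   ~ "Second Round",  # placeholder label (assumption)
      TRUE                      ~ tourney_finish   # leave any other value unchanged
    )
  )

results
```

The conditions are checked in order and the first one that matches wins, so the final TRUE line acts as a catch-all for anything not listed.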
Data visualizations (figures) that you enjoyed
I liked a few things about this visualization.
1: Anytime you have so many different factors (like 20+ schools), it can be very easy to get lost or overwhelmed. Using a subset of teams kept the colour scheme clear and consistent, which I found easier to interpret.
2: I liked the minimalist approach to the graph elements.
Data Visualization (Figures) that could be improved (and how you would improve them)
Some things I would change:
1: Though I was drawn to this visualization, it does have a lot going on. I might choose to display only a few of the figures so that it is more focused. For example, I feel the bottom right-most figure is a bit cluttered, and I'm not sure that it really adds anything to the visualization as a whole.
2: I would not have gone with the off-white tint that is present here (though it's possible this was accidental; if you've ever used the Snipping Tool on Windows while a blue-light filter was on, this is often the result).
3: With the line figure, I think I would have added a few more years to the x-axis labels (e.g. 1990, 1995, 2000). I would also have put the x-axis labels at the bottom, as I feel that is where people will more naturally look for them (and this is usually what I see in scientific papers).
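Points 2 and 3 could be sketched in ggplot2 roughly as follows. The data, the year range, and the object names are made up for illustration; this only shows the axis and background settings I have in mind, not a recreation of his figure.

```r
library(ggplot2)

# Made-up yearly values, for illustration only
df <- data.frame(year = 1985:2015, value = seq(0.4, 0.7, length.out = 31))

# Label every fifth year (1985, 1990, 1995, ...)
year_breaks <- seq(1985, 2015, by = 5)

p <- ggplot(df, aes(x = year, y = value)) +
  geom_line() +
  scale_x_continuous(
    breaks = year_breaks,
    position = "bottom"   # draw the x axis below the panel
  ) +
  theme_minimal() +
  # plain white background instead of an off-white tint
  theme(plot.background = element_rect(fill = "white", colour = NA))

p
```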
Side note
I viewed many other submissions as well and found a great resource. A few people mentioned Cedric Scherer's visualization. I looked at it too, and on his Twitter page I found other visualizations he had done, as well as this thorough guide to using ggplot:
"A ggplot2 Tutorial for Beautiful Plotting in R" (https://cedricscherer.netlify.app/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/)
His visualizations tend to be more on the artsy side, but sometimes this is the most compelling way to display the data. For example, I was blown away by this figure that he made:
https://pbs.twimg.com/media/EhfhullWsAMoetO?format=jpg&name=4096x4096