jtanggsd closed 4 years ago
A-ma-zing! Everything is on point, and I have nothing to add regarding the code. Maybe I can offer my opinion on some of the questions you asked in the file:
Q1: Plot 1 indicates whether or not someone is Rent Burdened, but I don’t know if this says much theoretically. A1: You can also infer that a household is considered rent burdened if its monthly rent is more than 1/40 of the household income. 40 is the slope of the boundary between the two sets of points. If the income is annual, that is the same as spending more than 12/40 = 30% of income on rent.
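To make the 1/40 rule concrete, here is a minimal sketch in base R. The data frame and its column names (annual_income, monthly_rent) are made up for illustration, and it assumes income is annual and rent is monthly, which is how I read the plot.

```r
# Hypothetical households; income is assumed annual, rent monthly.
households <- data.frame(
  annual_income = c(24000, 48000, 60000, 90000),
  monthly_rent  = c(900, 1100, 1400, 2500)
)

# Rule read off the plot: burdened when monthly rent > income / 40.
households$burdened_slope <- households$monthly_rent > households$annual_income / 40

# Equivalent rule: annual rent is more than 30% of annual income (12 / 40 = 0.3).
households$rent_share     <- 12 * households$monthly_rent / households$annual_income
households$burdened_share <- households$rent_share > 0.30

print(households)
```

Both columns flag the same households, so the slope of 40 is just another way of drawing the 30% line.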
Q2: How to create an axis break so that I can show millionaires on the same plot, without squishing axes? A2: It's not easy to do in ggplot, but axis.break in the plotrix package handles this very well. There's also a function gap.plot in the same package.
Q3: How to override alpha in the legend so that the legend dots are not so transparent? A3: Try playing with guides(color = guide_legend(override.aes = list(alpha = x))), where x is the alpha you want for the legend dots (1 is fully opaque). See if anything comes up.
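Here is roughly what I mean, as a small sketch. The data frame and the variable names (income, rent, tenure) are placeholders, not the ones from your file.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data standing in for the household points.
df <- data.frame(
  income = runif(200, 10000, 150000),
  rent   = runif(200, 300, 4000),
  tenure = sample(c("Owner", "Renter"), 200, replace = TRUE)
)

p <- ggplot(df, aes(x = rent, y = income, color = tenure)) +
  geom_point(alpha = 0.2) +  # faint points in the plot panel
  # Legend keys are drawn with alpha = 1, regardless of the layer's alpha.
  guides(color = guide_legend(override.aes = list(alpha = 1)))

print(p)
```

The points stay transparent in the panel, and only the legend keys are drawn solid.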
Q4: Still need to look into how to read this (Plot 4). CD said the y-axis scale corresponds with distance from center? A4: True, the farther from the center, the higher the income. Each dot represents one household.
Q5: Am I able to say that non-English-speaking households earn no more than 75% of what English-speaking households earn? A5: To say that, you'd need to do a statistical analysis to see if these two numbers are significantly different. More specifically, you need to perform a two-sample test for equal means (t-test).
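A minimal sketch of such a test in base R is below. The two income vectors are simulated for illustration only, so swap in the real income columns for each group. Note that t.test runs Welch's version by default, which does not assume equal variances.

```r
set.seed(42)
# Simulated incomes for illustration only; replace with the real columns.
english     <- rlnorm(300, meanlog = log(60000), sdlog = 0.6)
non_english <- rlnorm(120, meanlog = log(45000), sdlog = 0.6)

# Welch two-sample t-test for equal means (R's default).
result <- t.test(english, non_english)
print(result)

# Ratio of group means, i.e. the "75%" style of statement.
ratio <- mean(non_english) / mean(english)
print(ratio)
```

The p-value tells you whether the difference in means is distinguishable from noise. The ratio is a separate descriptive number, and the test by itself says nothing about a specific cutoff like 75%.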
Q6: (plot 6) I wonder what the spike is between people who make 600,000 and people who make 700,000. A6: Maybe you're onto something! If you're interested in how you can use statistics and data viz to detect tax fraud, see Benford's law.
Q7: (plot 7a): I had a really hard time conceptualizing this plot and the relationship between the 2 variables. A7: You can create a percent stacked bar plot (same as yours, only it's normalized to 100% and shows percentages on the y-axis). That way, it's easy to see what percent of people are using food stamps.
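In ggplot, the percent stacked version is mostly a matter of position = "fill". Here is a small sketch; the data frame and the names group and food_stamps are hypothetical stand-ins for your two variables.

```r
library(ggplot2)

# Hypothetical household records.
df <- data.frame(
  group       = rep(c("A", "B"), times = c(6, 4)),
  food_stamps = c("No", "No", "No", "No", "Yes", "Yes",
                  "No", "Yes", "Yes", "Yes")
)

p <- ggplot(df, aes(x = group, fill = food_stamps)) +
  geom_bar(position = "fill") +                   # each bar is rescaled to 100%
  scale_y_continuous(labels = scales::percent) +  # show the y-axis as percentages
  labs(y = "Share of households")

print(p)

# The same shares as a table: each row sums to 1.
shares <- prop.table(table(df$group, df$food_stamps), margin = 1)
print(shares)
```

The table at the end gives the numbers behind the bars, in case you want to quote them in the text.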
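Q8: How much less likely are these Households to be on Food Stamps? Can I calculate this by comparing the areas of the violins? A8: A violin plot is actually a probability density estimate - its width shows how relatively common each value is, not how many households are in the group. The area of each violin is always 100%, so comparing areas won't tell you how much less likely a group is to be on Food Stamps. For that, compare the share of households on Food Stamps in each group.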
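To see why the areas can't answer this, here is a sketch in base R with simulated data. The two groups and their sizes are invented, and density_area is a helper I made up that integrates a kernel density estimate with the trapezoid rule.

```r
set.seed(7)
# Simulated incomes for two hypothetical groups of very different sizes.
on_stamps  <- rlnorm(50,  meanlog = log(25000), sdlog = 0.5)
off_stamps <- rlnorm(450, meanlog = log(60000), sdlog = 0.5)

# Area under a kernel density estimate, via the trapezoid rule.
density_area <- function(x) {
  d <- density(x)
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

area_on  <- density_area(on_stamps)
area_off <- density_area(off_stamps)
print(c(area_on, area_off))  # both close to 1, despite 50 vs 450 households

# The "how likely" question is answered by counts instead.
share_on <- length(on_stamps) / (length(on_stamps) + length(off_stamps))
print(share_on)
```

Both areas come out at about 1 even though one group is nine times larger, so the group sizes (and therefore the likelihoods) have to come from the counts.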
Both the Rmd and html files are ready for grading. The html file can be viewed here: https://vis-2129-f2020.github.io/jtang-vis/Assignment01_JT
Thank you and looking forward to your feedback.