VIS-2129-F2020 / jtang-vis

Assignment01 is ready for grading #1

Closed: jtanggsd closed this issue 4 years ago

jtanggsd commented 4 years ago

Both the Rmd and html files are ready for grading. The html file can be viewed here: https://vis-2129-f2020.github.io/jtang-vis/Assignment01_JT

Thank you and looking forward to your feedback.

bauranov commented 4 years ago

A-ma-zing! Everything is on point, I have nothing to add regarding the code. Maybe I can offer my opinion on some of the questions you asked in the file:

Q1: Plot 1 indicates whether or not someone is Rent Burdened, but I don’t know if this says much theoretically. A1: You can also infer the rule behind the flag: a household is considered burdened if its monthly rent is more than 1/40 of the household income. The 40 is the slope of the boundary between the two sets of points.
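A minimal sketch of that threshold in R, using made-up numbers. The column names `income` and `rent` are assumptions, not the names in your data, and I'm assuming income is annual and rent is monthly, in which case "rent above income / 40" is the same as spending more than 30% of income on rent:

```r
# Hypothetical households: annual income and monthly rent (made-up values)
households <- data.frame(
  income = c(24000, 48000, 60000, 120000),
  rent   = c(900, 1100, 1600, 2500)
)

# Burdened when monthly rent exceeds 1/40 of annual income
households$burdened <- households$rent > households$income / 40

# Equivalent check: yearly rent above 30% of annual income
households$burdened_30pct <- 12 * households$rent / households$income > 0.30

print(households)
```

Both columns flag the same households, which is why the boundary in the plot is a straight line through the origin.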

Q2: How to create an axis break so that I can show millionaires on the same plot, without squishing axes? A2: It's not easy to do in ggplot, but axis.break in the plotrix package handles this very well. The same package also has a gap.plot function.

Q3: How to override alpha in the legend so that the legend dots are not so transparent? A3: Try to play with guides(color = guide_legend(override.aes = list(alpha= x))). See if anything comes up.
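Something along these lines might work. It is only a sketch on made-up data (the data frame `df` and its columns are placeholders, not your variables), with `alpha = 1` picked as one possible value for x:

```r
library(ggplot2)

# Made-up data standing in for the household points
set.seed(1)
df <- data.frame(
  x = rnorm(200),
  y = rnorm(200),
  group = sample(c("A", "B"), 200, replace = TRUE)
)

p <- ggplot(df, aes(x, y, color = group)) +
  # Points in the panel stay transparent
  geom_point(alpha = 0.2) +
  # Legend keys are drawn fully opaque instead
  guides(color = guide_legend(override.aes = list(alpha = 1)))

print(p)
```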

Q4: Still need to look into how to read this (Plot 4). CD said the y-axis scale corresponds with distance from center? A4: True, the farther from the center, the higher the income. Each dot represents one household.

Q5: Am I able to say that non-English-speaking households earn no more than 75% of what English-speaking households earn? A5: To say that, you'd need to do a statistical analysis to see whether the two group means are significantly different. More specifically, you need to perform a two-sample test for equal means (t-test).
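As a rough sketch, the test could look like this in R. The incomes below are simulated purely for illustration (the vectors `english` and `non_english` are placeholders for your two groups), so the printed numbers mean nothing for your data:

```r
# Simulated incomes only; replace with the real household incomes
set.seed(42)
english     <- rlnorm(300, meanlog = 11.0, sdlog = 0.6)
non_english <- rlnorm(120, meanlog = 10.7, sdlog = 0.6)

# Two-sample t-test for equal means
# (t.test uses the Welch version by default, which does not assume equal variances)
result <- t.test(english, non_english)
print(result)

# Ratio of sample means, the quantity behind the "75%" statement
ratio <- mean(non_english) / mean(english)
print(ratio)
```

Income tends to be right-skewed, so running the same test on log income may be worth considering as well.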

Q6: (plot 6) I wonder what the spike is between people who make 600,000 and people who make 700,000. A6: Maybe you're onto something! If you're interested in how you can use statistics and data viz to detect tax fraud, see Benford's law.
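For reference, Benford's law says that the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). A small sketch of the comparison, with a handful of made-up incomes standing in for real data:

```r
# Expected share of each leading digit under Benford's law
digits   <- 1:9
expected <- log10(1 + 1 / digits)

# Made-up incomes for illustration only
incomes <- c(12500, 640000, 38000, 1900, 710000, 24000, 650000, 9100)

# Leading digit: divide by the largest power of 10 not exceeding the value
leading <- floor(incomes / 10^floor(log10(incomes)))

# Observed share of each leading digit, side by side with the expected share
observed <- as.numeric(table(factor(leading, levels = digits))) / length(incomes)
print(data.frame(digit = digits, expected = round(expected, 3), observed = observed))
```

With a real sample you would need far more than eight values before the comparison says anything.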

Q7: (plot 7a): I had a really hard time conceptualizing this plot and the relationship between the 2 variables. A7: You can create a percent stacked bar plot (same as yours, only it's normalized to 100% and shows percentages on the y-axis). That way, it's easy to see what percent of people are using food stamps.
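Here is a minimal base R sketch of the normalization, on made-up counts (the group labels are placeholders for whatever your second variable is):

```r
# Made-up counts: rows = food stamp status, columns = household group
counts <- matrix(
  c(30, 70,   # Group A: Yes, No
    10, 90),  # Group B: Yes, No
  nrow = 2,
  dimnames = list(food_stamps = c("Yes", "No"),
                  group = c("Group A", "Group B"))
)

# Normalize each column so every bar sums to 100%
shares <- prop.table(counts, margin = 2) * 100
print(shares)

# Percent stacked bar plot
barplot(shares, ylab = "Percent of households", legend.text = TRUE)
```

In ggplot2, the same normalization is available through `geom_bar(position = "fill")`.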

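Q8: How much less likely are these Households to be on Food Stamps? Can I calculate this by comparing the areas of the violins? A8: A violin plot is actually a probability density estimate - its width shows the relative density at each value. The area of each violin is always 100%, so comparing the areas won't tell you how much less likely a group is to be on food stamps. For that, compare the share of households on food stamps in each group.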
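You can check the "area is always 100%" point numerically. This sketch uses simulated values and base R's `density()` as a stand-in for the estimate behind a violin; the helper `area_under_density` is just something written for this illustration:

```r
# Two simulated groups of very different size (made-up values)
set.seed(7)
small_group <- rnorm(50, mean = 40000, sd = 10000)
large_group <- rnorm(5000, mean = 60000, sd = 15000)

# Area under a kernel density estimate, via the trapezoid rule
area_under_density <- function(x) {
  d <- density(x)
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

area_small <- area_under_density(small_group)
area_large <- area_under_density(large_group)
print(c(small = area_small, large = area_large))
```

Both areas come out at approximately 1 even though one group is 100 times larger, so the area carries no information about group size or likelihood.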