UCB-stat-159-s22 / hw07-Group21

hw07-group21 created by GitHub Classroom
MIT License

Potential Analysis Topics #2

Open amichaelsen opened 2 years ago

amichaelsen commented 2 years ago

This issue is for brainstorming possible analysis topics.

amichaelsen commented 2 years ago

Data Collection/Processing

Reproduction

New Analysis

Stephenouu commented 2 years ago

Some possible topics that came up while looking at the data dictionary:

  1. Compare employment/unemployment rates over time, either for the most popular majors or for all major categories, if we can extract datasets from recent years. Otherwise we can just compare the rates between 2010 and 2012.
  2. Does a higher level of education mean a better quality of life? The PUMS data has variables for water bills, fuel, transportation to work, medical care for low-income families, and food stamp service.
  3. Which major brings the best quality of life?
  4. How has the salary for a given major changed over 10 years? (Compare the 2010 and 2020 data.)
amichaelsen commented 2 years ago

For comparisons over time there are two options:

amichaelsen commented 2 years ago

Questions:

Note: The FiveThirtyEight analysis excludes people whose total earnings (not income) were negative. These seem to be mostly low-income people categorized as "self employed". I'm not sure whether they should be excluded from our analysis; we could try both or discuss which option is more appropriate.
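
Trying both options could look roughly like the sketch below. The records and the keys `major` and `earnings` are made-up placeholders, not actual PUMS variable names, and using the median as the summary statistic is an assumption for illustration.

```python
from statistics import median

# Hypothetical toy records; "major" and "earnings" are placeholder keys,
# not real PUMS column names.
records = [
    {"major": "A", "earnings": 50000},
    {"major": "A", "earnings": -2000},
    {"major": "A", "earnings": 30000},
    {"major": "B", "earnings": 40000},
    {"major": "B", "earnings": 60000},
]

def median_earnings_by_major(rows, exclude_negative):
    """Median earnings per major, optionally dropping negative earnings."""
    by_major = {}
    for row in rows:
        if exclude_negative and row["earnings"] < 0:
            continue  # skip people with negative total earnings
        by_major.setdefault(row["major"], []).append(row["earnings"])
    return {major: median(values) for major, values in by_major.items()}

# Run the same summary both ways so the two results can be compared.
with_negative = median_earnings_by_major(records, exclude_negative=False)
without_negative = median_earnings_by_major(records, exclude_negative=True)
```

Comparing `with_negative` and `without_negative` major by major would show how much the exclusion actually changes the results.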

Also, some fields have very few graduates when considered in a single year (e.g. petroleum engineering!), and these tend to give high variance (they show up as both the highest and lowest earning groups). We could either group majors (by major category or a smaller subset) or eliminate majors with fewer than 10 (or some other cutoff) people in the sample. (Note: 75% of majors from 2018 had at least 33 people, so 10+ should capture most majors.)
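
A minimal sketch of the cutoff option, assuming we have one major label per respondent in the sample. The function name, the constant, and the toy counts are hypothetical, and the threshold of 10 is just the value floated above.

```python
from collections import Counter

MIN_SAMPLE_SIZE = 10  # cutoff suggested above; could be tuned

def majors_with_enough_samples(majors, min_count=MIN_SAMPLE_SIZE):
    """Return the set of majors that appear at least min_count times."""
    counts = Counter(majors)
    return {major for major, n in counts.items() if n >= min_count}

# Hypothetical sample: one major label per respondent (counts are made up).
sample = ["Nursing"] * 40 + ["Petroleum Engineering"] * 3 + ["Statistics"] * 10
kept = majors_with_enough_samples(sample)
```

Here `kept` holds only the majors that meet the cutoff, so later summaries can be restricted to those.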

amichaelsen commented 2 years ago

Other articles dealing with data: