tom-hc-park / STAT550-450-for-Seniorworkers-from-Korea


The correlation between 4 dependent variables #10

Open xinyaofan opened 6 years ago

xinyaofan commented 6 years ago

Hi @ekroc, I calculated the correlation coefficients for three pairs of response variables: num_use and liter_use (0.6921877), liter_use and pvlitM (0.423401), and num_use and pvnumM (0.3686938). So our 4 response variables are correlated. For example, someone with high literacy use is likely to score higher on the test. Do you think we should still build 4 separate models? Or could we combine the information in the dependent variables to reduce the number of responses? Or should we just drop some of the dependent variables?
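For reference, here is a minimal sketch of how all six pairwise Pearson correlations among the four responses could be computed, not just the three pairs reported above. The column names come from this thread, but the numbers in `responses` are made up purely for illustration and are not the project data; `pearson` is a hypothetical helper written for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# ASSUMPTION: illustrative values only, NOT the real survey data.
responses = {
    "num_use": [1.0, 2.5, 3.0, 4.2, 2.0, 3.8],
    "liter_use": [1.4, 2.2, 3.1, 3.9, 2.6, 3.5],
    "pvlitM": [250.0, 270.0, 265.0, 310.0, 240.0, 290.0],
    "pvnumM": [245.0, 260.0, 280.0, 300.0, 255.0, 275.0],
}

# Every unordered pair of responses: 4 choose 2 = 6 pairs.
pairwise = {
    (a, b): pearson(responses[a], responses[b])
    for a, b in combinations(responses, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

Looking at the full set of pairs makes it easier to judge whether the two usage measures and the two proficiency scores each form their own cluster.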

xinyaofan commented 6 years ago

@NSKrstic Sorry, I tagged the wrong person! Nikolas, could you please take a look at the questions above? Thank you!

NSKrstic commented 6 years ago

@aiod01 (don't forget to tag Tom too)

Well, we're focusing on "num_use" and "liter_use", and our goal is to fit a separate model for each. Although those responses are correlated, the correlation isn't extremely high. Trying to aggregate the variables further (when they're already averages of Likert scales) is probably a bad call. It is something to discuss in our report, though: the responses are correlated, as expected given the nature of the measures.
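One rough way to put "not extremely high" into numbers is to square each reported coefficient, which gives the share of variance one variable linearly explains in the other. The sketch below uses only the three values quoted earlier in this thread, assuming they are listed in the same order as the pairs; the dictionary names are just for this example.

```python
# Correlations as reported in this thread (pair order assumed to match).
reported = {
    ("num_use", "liter_use"): 0.6921877,
    ("liter_use", "pvlitM"): 0.423401,
    ("num_use", "pvnumM"): 0.3686938,
}

# r squared: proportion of variance shared under a simple linear relationship.
shared = {pair: r ** 2 for pair, r in reported.items()}

for (a, b), r2 in shared.items():
    print(f"{a} vs {b}: r^2 = {r2:.3f}")
```

Even for the strongest pair (num_use and liter_use), a bit under half of the variance is shared, so each response still carries a good deal of information of its own.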

NSKrstic commented 6 years ago

STAT 450 will focus on modelling the other two responses (proficiency scores), as we decided before.