AntoineSoetewey / statsandr

A blog on statistics and R aiming at helping academics and professionals working with data to grasp important concepts in statistics and to apply them in R. See www.statsandr.com
http://statsandr.com/

blog/chi-square-test-of-independence-by-hand/ #18

Closed utterances-bot closed 3 years ago

utterances-bot commented 3 years ago

Chi-square test of independence by hand - Stats and R

Test if two categorical variables are dependent via the Chi-square test of independence. See also how to compute it by hand and how to interpret the results

https://statsandr.com/blog/chi-square-test-of-independence-by-hand/

AntoineSoetewey commented 3 years ago

Comment written by Kuo Yao Hung on January 29, 2020 03:47:04:

A really nice article, thanks.

AntoineSoetewey commented 3 years ago

Comment written by Antoine Soetewey on January 29, 2020 08:58:47:

Glad you liked it Kuo!

manoj1123 commented 3 years ago

Dear Antoine. Thank you so much for the article. It really added to my understanding of the concept. I have a small doubt regarding the following statement: "If the test statistic is above the critical value, it means that the probability of observing such a difference between the observed and expected frequencies is unlikely"

My confusion is this: if such a difference is unlikely, then there should be no relation between the two variables. Instead, can I read the statement as "If the test statistic is above the critical value, it means that the probability of observing such a difference between the observed and expected frequencies is unlikely BY RANDOM CHANCE"?

Thanks again. Thanks for your time.

AntoineSoetewey commented 3 years ago

Thanks for your question.

You're right; if the test statistic is above the critical value (determined by the Chi-square table), it means that such a difference between the observed and expected frequencies would be unlikely to occur BY RANDOM CHANCE alone if the 2 variables were independent.

However, remember that the null and alternative hypotheses of the Chi-square test of independence are:

  • H0: the 2 variables are independent
  • H1: the 2 variables are dependent

This means that: if the test statistic is above the critical value --> we reject the null hypothesis of independence because the probability of observing such a large difference between the expected and observed frequencies just by chance is small (i.e., the p-value is small) --> the 2 variables are dependent --> there is a significant relationship between the 2 variables.
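To make this decision rule concrete, here is a minimal sketch in plain Python. The 2x2 counts are made up for illustration (they are not the data from the article), and the function names are hypothetical. It assumes 1 degree of freedom and a 5% significance level, uses the tabulated critical value 3.841, and applies no continuity correction.

```python
import math

# Hypothetical 2x2 contingency table (made-up counts, not data from the article).
observed = [[30, 10],
            [20, 40]]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under H0 (independence).
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom.

    Valid for df = 1 only: a chi-square(1) variable is a squared standard normal.
    """
    return math.erfc(math.sqrt(stat / 2))


# Critical value from the Chi-square table for df = 1 and alpha = 0.05.
CRITICAL_VALUE = 3.841

statistic = chi_square_statistic(observed)
p_value = p_value_df1(statistic)
reject_h0 = statistic > CRITICAL_VALUE

print(f"statistic = {statistic:.3f}, p-value = {p_value:.6f}, reject H0: {reject_h0}")
```

With these made-up counts the statistic is about 16.67, well above 3.841, so the null hypothesis of independence is rejected. Comparing the statistic to the critical value and comparing the p-value to 0.05 lead to the same conclusion.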

Hope this helps. Let me know if it is still unclear.

Regards, Antoine

manoj1123 commented 3 years ago

I must say thank you so much for the clarity and the response.

Wishing you a very happy new year.

Best Regards, Manoj

