jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Feature Request]: Naming of residuals in contingency tables in 0.16.4 #1847

Closed PerPalmgren closed 1 year ago

PerPalmgren commented 2 years ago

Description

Naming of residuals

Purpose

No response

Use-case

No response

Is your feature request related to a problem?

Why are the residuals (a wonderful feature in this version, by the way) named Unstandardized, Pearson, and Standardized residuals? This is very confusing, because the SPSS nomenclature is residuals, standardized residuals, and adjusted residuals. So are Pearson residuals in JASP the same as standardized residuals in SPSS, and are standardized residuals in JASP the same as adjusted residuals in SPSS? This is not optimal. See the screenshots from SPSS and JASP below.

Describe the solution you would like

Use naming closer to that of SPSS, or use Unstandardized, Standardized, and Adjusted standardized.

Describe alternatives that you have considered

No response

Additional context

[Image: Namnlös (untitled)]

Residulas.zip

Kucharssim commented 2 years ago

Dear @PerPalmgren,

I agree that the difference between JASP and SPSS is confusing for SPSS users, but we do have a reason to do this.

Consider that Agresti (2019) uses the name "standardized" for the same residuals that we call standardized in JASP:

Screenshot 2022-10-04 at 21 13 15

He also gives a reason why the Pearson residual is called that (i.e., it applies to GLMs under the same name and leads directly to the Pearson chi-square statistic):

Screenshot 2022-10-04 at 21 12 08

Our use of the terms is also consistent with the base function in R for doing a chi-square test.

I understand that the inconsistency with the SPSS terminology is frustrating. However, we are not trying to imitate whatever SPSS does, but rather to make our own judgement. In cases like this one (where I would argue SPSS itself deviates from the literature and from other software packages), we will sometimes deviate from it.

That said, I think it would be reasonable to explain this difference in the help file, so that the difference is clearer to users who are used to the terms used in SPSS. Would that be helpful?

PerPalmgren commented 2 years ago

Hi Simon,

Thank you for your swift and thorough reply. I understand this choice, but for clarity for JASP users they could perhaps instead be called Pearson residual and Standardized Pearson residual.

All the best

Per


PerPalmgren commented 1 year ago

Let me propose to call it (see reference below)

PerPalmgren commented 1 year ago

Any thoughts on my last suggestion?

PerPalmgren commented 1 year ago

@Kucharssim any thoughts on my suggestion for naming?

PerPalmgren commented 1 year ago

@Kucharssim @juliuspfadt I still think my suggestion regarding the naming of residuals, together with the reference provided, would make things clearer for JASP users. @Kucharssim the reference I provided also refers to the calculations by Agresti. Kind regards, Per

PerPalmgren commented 1 year ago

Any response?

Kucharssim commented 1 year ago

Hi @PerPalmgren,

Sorry, your earlier response escaped my attention.

I will try to summarise. Please correct me if I am wrong.

Currently we have these types of residuals in JASP:

  1. Unstandardized residuals: calculated as observed - expected
  2. Pearson residuals: calculated as (observed - expected) / sqrt(expected)
  3. Standardized residuals: calculated as (observed - expected) / sqrt(V), where V is the residual error variance.

This is consistent with terminology used in R and other analyses like Poisson GLM as discussed in the Agresti reference.
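To make the three definitions concrete, here is a minimal Python sketch. It is an illustration, not JASP's code: the function name `contingency_residuals` and the example table are made up, and it assumes that V is the usual variance estimate under independence, expected × (1 − row proportion) × (1 − column proportion), which is the form given by Agresti and used for `stdres` in R's `chisq.test`.

```python
import math


def contingency_residuals(observed):
    """Return (unstandardized, Pearson, standardized) residual tables.

    `observed` is a list of rows of cell counts. Hypothetical helper for
    illustration only.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    unstandardized, pearson, standardized = [], [], []
    for i, row in enumerate(observed):
        raw_row, pearson_row, std_row = [], [], []
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            # Assumed form of V: estimated variance of (observed - expected).
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            raw = count - expected
            raw_row.append(raw)                           # type 1
            pearson_row.append(raw / math.sqrt(expected))  # type 2
            std_row.append(raw / math.sqrt(variance))      # type 3
        unstandardized.append(raw_row)
        pearson.append(pearson_row)
        standardized.append(std_row)
    return unstandardized, pearson, standardized


# Made-up 2x2 table, not data from this thread.
table = [[10, 20], [30, 40]]
raw, pearson, standardized = contingency_residuals(table)
chi_square = sum(value ** 2 for row in pearson for value in row)
```

The squared Pearson residuals sum to the Pearson chi-square statistic, which is the link mentioned earlier in the thread. Under the stated assumption about V, a standardized residual is always larger in absolute value than the Pearson residual of the same cell.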

SPSS uses the following names:

  1. Residuals
  2. Standardized residuals
  3. Adjusted residuals

You propose to change the naming in the following way:

  1. Residuals
  2. Pearson residuals
  3. Adjusted Pearson residuals

I admit I am still uncertain about this proposal, as I would prefer to stay consistent with other JASP analyses and with references like Agresti rather than with SPSS. I also see value in calling the third type of residuals "standardized", as they follow the standard normal distribution under the null hypothesis.
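The "standard normal" point can be sketched as a short formula. The notation is not from this thread and is only assumed here: $n_{ij}$ is the observed count, $\hat{\mu}_{ij}$ the expected count under independence, and $p_{i+}$, $p_{+j}$ the sample row and column proportions. The variance expression is the usual large-sample estimate under independence.

```latex
e_{ij} = \frac{n_{ij} - \hat{\mu}_{ij}}{\sqrt{\hat{\mu}_{ij}}},
\qquad
\widehat{\operatorname{Var}}(e_{ij}) \approx (1 - p_{i+})(1 - p_{+j}) < 1,
\qquad
r_{ij} = \frac{e_{ij}}{\sqrt{(1 - p_{i+})(1 - p_{+j})}}
       = \frac{n_{ij} - \hat{\mu}_{ij}}{\sqrt{\hat{\mu}_{ij}\,(1 - p_{i+})(1 - p_{+j})}}
```

Under this sketch, the Pearson residual $e_{ij}$ varies less than a standard normal variable, and dividing by its estimated standard error gives $r_{ij}$, which is approximately standard normal under independence in large samples.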

Further, if I understand correctly, the reference you provided uses three different names for the third type of residuals: "adjusted Pearson residuals", "adjusted standardized residuals", and "standardized residuals" - the latter is also used by JASP.

I do understand that the clash between what JASP calls "standardized" and what SPSS calls "standardized" is unfortunate, but I still think this could be clarified for SPSS users in the help file. Alternatively, it would also be possible to use something like:

  1. Unstandardized residuals
  2. Pearson residuals
  3. Standardized (adjusted Pearson) residuals

That is perhaps the best of both worlds?

As this is mostly a matter of preference and conventions, perhaps @EJWagenmakers would want to give his opinion?

PerPalmgren commented 1 year ago

Hi Simon, I do understand the dilemma and I think your last proposal is a very good compromise and “the best of two worlds”. Great😜👍 Per


JaspBoy commented 5 months ago

Can anyone help me distinguish something then: when would one use adjusted Pearson residuals instead of Pearson residuals, given that the JASP guide book showcases the Pearson residuals rather than the adjusted ones?