stdlib-js / stdlib

✨ Standard library for JavaScript and Node.js. ✨
https://stdlib.io
Apache License 2.0

Getting unexpected results from chi2test #458

Closed. cff3 closed this issue 2 years ago

cff3 commented 2 years ago

Description

I'm evaluating chi2test. In some cases I get results that differ from those of other chi-square independence tests available on the web.

The pattern seems to be that chi2test produces the expected results when the number of rows equals the number of columns, but gives different results when there are more rows than columns or vice versa.

I'm comparing against the following examples and implementations:
a) The initial example is from the Zed Statistics video on chi-square independence: https://www.youtube.com/watch?v=NTHA9Qa81R8&t=925s
b) The other comparison results were calculated with https://www.mathsisfun.com/data/chi-square-calculator.html and http://quantpsy.org/chisq/chisq.htm.

Am I doing something wrong? Or is there a bug in chi2test or in the other implementations and examples?

Questions

No.

Demo

No response

Reproduction

import chi2test from '@stdlib/stats-chi2test';
import array from '@stdlib/ndarray-array';

describe('chi2test', () => {

  it('should give the expected results', () => {

    // 2x3 example from Zed Statistics (more columns than rows)
    const res1 = chi2test(array([
      [15, 30, 5],
      [20, 35, 15]
    ]));
    expect(res1.statistic).toBeCloseTo(2.845, 3); // fails: chi2test returns 1.056

    const res2 = chi2test(array([
      [15, 30, 5],
      [20, 35, 15],
      [30, 45, 25]
    ]));
    expect(res2.statistic).toBeCloseTo(5.247, 3); // success.

    const res3 = chi2test(array([
      [15, 30, 5],
      [20, 35, 15],
      [30, 45, 25],
      [35, 50, 30]
    ]));
    expect(res3.statistic).toBeCloseTo(6.759, 3); // fails: chi2test returns 5.539

    const res4 = chi2test(array([
      [15, 30, 5, 20],
      [20, 35, 15, 25],
      [30, 45, 25, 30],
      [35, 50, 30, 40]
    ]));
    expect(res4.statistic).toBeCloseTo(7.355, 3); // success.
  });
});

Expected Results

No response

Actual Results

No response

Version

"@stdlib/stats-chi2test": "0.0.6"

Environments

Node.js

Browser Version

No response

Node.js / npm Version

node.js 12.16.2

Platform

Mac OS X, Big Sur, 11.5.2

github-actions[bot] commented 2 years ago

:tada: Welcome! :tada:

And thank you for opening your first issue! We will get back to you shortly. :runner: :dash:

kgryte commented 2 years ago

@Planeshifter You have thoughts on this?

Planeshifter commented 2 years ago

Thank you @cff3 for filing this issue!

Can confirm that due to an unfortunate indexing bug, the calculation of the test statistic could be off for non-square input matrices. Just published a patch (v0.0.7) for the @stdlib/stats-chi2test package that should now always return the expected results. Tests have been updated to confirm the results now match those of R for the two failing examples from above (see updated fixtures and tests).
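For anyone who wants to cross-check a non-square table without relying on the package, the Pearson statistic can be computed by hand. The following is a minimal sketch, not the @stdlib implementation: `pearsonChi2` is a hypothetical helper that takes plain nested arrays and applies the textbook formula, the sum over all cells of (observed - expected)^2 / expected, with expected = rowTotal * colTotal / grandTotal. It applies no continuity correction.

```javascript
// Hypothetical helper for cross-checking results; NOT the @stdlib/stats-chi2test code.
// Computes the Pearson chi-square statistic for an r x c table of observed counts.
function pearsonChi2(table) {
  const nRows = table.length;
  const nCols = table[0].length;

  // Marginal totals: one per row and one per column.
  const rowTotals = table.map((row) => row.reduce((a, b) => a + b, 0));
  const colTotals = [];
  for (let j = 0; j < nCols; j++) {
    let sum = 0;
    for (let i = 0; i < nRows; i++) {
      sum += table[i][j];
    }
    colTotals.push(sum);
  }
  const total = rowTotals.reduce((a, b) => a + b, 0);

  // Row index i pairs with rowTotals, column index j with colTotals;
  // the two ranges differ whenever the table is not square.
  let statistic = 0;
  for (let i = 0; i < nRows; i++) {
    for (let j = 0; j < nCols; j++) {
      const expected = (rowTotals[i] * colTotals[j]) / total;
      const diff = table[i][j] - expected;
      statistic += (diff * diff) / expected;
    }
  }
  return { statistic, df: (nRows - 1) * (nCols - 1) };
}

// The 2x3 table from the report above.
const check = pearsonChi2([
  [15, 30, 5],
  [20, 35, 15]
]);
console.log(check.statistic.toFixed(3)); // 2.845
console.log(check.df); // 2
```

For the 2x3 table this gives the 2.845 that the report expected, and for the 4x3 table it gives roughly 6.759.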

kgryte commented 2 years ago

Thanks for pushing out a quick fix @Planeshifter!

cff3 commented 2 years ago

Thank you @Planeshifter and @kgryte. That was extraordinarily quick. I can confirm that with 0.0.7 the results are as expected.

@Planeshifter, one more question before closing: there is no 0.0.7 for chi2gof. I have not started testing chi2gof yet, but will do so soon. Could chi2gof 0.0.6 be affected by the same issue?

kgryte commented 2 years ago

@cff3 I don't believe that chi2gof is affected by the same issue, but feel free to test on your end. 😄

cff3 commented 2 years ago

Thanks @kgryte. I'll close this one. If I find any unexpected results in chi2gof, I'll open a new issue. Thanks a lot for stdlib!

kgryte commented 2 years ago

No problem, @cff3. Happy to help! 🙌