rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Value of `tb` is not reproduced in documentation of mcnemar_test #988

Open ftnext opened 1 year ago

ftnext commented 1 year ago

Describe the documentation issue

https://rasbt.github.io/mlxtend/user_guide/evaluate/mcnemar_table/#example-2-2x2-contingency-table

The value of tb shown in the documentation cannot be reproduced. The documentation shows:

array([[4, 1],
       [2, 3]])

Running the same code as the documentation example gives a different result:

>>> import numpy as np
>>> from mlxtend.evaluate import mcnemar_table
>>> y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
>>> y_mod1 = np.array([0, 1, 0, 0, 0, 1, 1, 0, 0, 0])  # R, W, R, R, R, R, R, W, W, W
>>> y_mod2 = np.array([0, 0, 1, 1, 0, 1, 1, 0, 0, 0])  # R, R, W, W, R, R, R, W, W, W
>>> tb = mcnemar_table(y_target=y_true, y_model1=y_mod1, y_model2=y_mod2)
>>> tb
array([[4, 2],
       [1, 3]])

(In comment, R stands for right, W stands for wrong.)

Environment:

mlxtend         0.21.0
numpy           1.23.3

I think the value of tb in the current documentation is wrong.

Reasons:

b: tb[0, 1]: # of samples that model 1 got right and model 2 got wrong

tb[0, 1] should be 2 (indices: 2, 3)

c: tb[1, 0]: # of samples that model 2 got right and model 1 got wrong

tb[1, 0] should be 1 (index: 1)

The test case at https://github.com/rasbt/mlxtend/blob/v0.21.0/mlxtend/evaluate/tests/test_mcnemar_table.py#L70-L77 also expects tb to be np.array([[4, 2], [1, 3]])
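
As an independent cross-check, the four cells can also be counted by hand without mlxtend. The sketch below uses plain Python lists. The helper name hand_count_table is hypothetical and is not part of mlxtend. The cell layout follows the definitions of b and c given above; placing "both right" at [0][0] and "both wrong" at [1][1] is my assumption about the remaining two cells.

```python
# Same labels and predictions as in the documentation example.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_mod1 = [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]
y_mod2 = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0]


def hand_count_table(y_target, y_model1, y_model2):
    """Count a 2x2 table by hand (hypothetical helper, not mlxtend API).

    Assumed layout:
      [0][0] both models right       [0][1] model 1 right, model 2 wrong
      [1][0] model 1 wrong, 2 right  [1][1] both models wrong
    """
    table = [[0, 0], [0, 0]]
    for target, pred1, pred2 in zip(y_target, y_model1, y_model2):
        row = 0 if pred1 == target else 1  # row: model 1 right (0) or wrong (1)
        col = 0 if pred2 == target else 1  # column: model 2 right (0) or wrong (1)
        table[row][col] += 1
    return table


tb_check = hand_count_table(y_true, y_mod1, y_mod2)
print(tb_check)  # [[4, 2], [1, 3]]
```

This hand count agrees with the output of mcnemar_table in my environment and with the test case, not with the array currently shown in the documentation.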

Suggest a potential improvement or addition

Change to

array([[4, 2],
       [1, 3]])

The checkerboard plot needs to be changed too.

Scope of impact