raphaelvallat / pingouin

Statistical package in Python based on Pandas
https://pingouin-stats.org/
GNU General Public License v3.0

Mixed-ANOVA error #381

Closed berkbuzcu closed 1 year ago

berkbuzcu commented 1 year ago

I am running a mixed ANOVA on data structured the same way as the example.

    550 n_rm = len(rm)
--> 551 n_obs = int(grp_with.count().max())
    552 grandmean = data[dv].mean()
    553 # Calculate sums of squares

ValueError: cannot convert float NaN to integer

When I retrace the calculation made here, I get the max count as an integer without any problem. Some internal calculation seems to turn the data into an empty grouped DataFrame.

Any thoughts?

raphaelvallat commented 1 year ago

Hi,

Can you please share a minimal example (with data and code) to reproduce the error?

Thanks Raphael

berkbuzcu commented 1 year ago

Below is the data sample and the line of code I use. @raphaelvallat


type | rating | nid | category
-- | -- | -- | --
A | 3 | 0 | A
B | 3 | 0 | A
C | 4 | 0 | A
D | 4 | 0 | B
E | 3 | 0 | B
F | 5 | 0 | B
G | 4 | 0 | C
H | 3 | 0 | C
J | 3 | 0 | C
A | 2 | 1 | A
B | 3 | 1 | A
C | 4 | 1 | A
D | 4 | 1 | B
E | 8 | 1 | B
F | 6 | 1 | B
G | 4 | 1 | C
H | 7 | 1 | C
J | 9 | 1 | C

pg.mixed_anova(data=data, dv='rating', between='category', within='type', subject='nid')

raphaelvallat commented 1 year ago

Hi @berkbuzcu,

Apologies about the slow reply. The issue is that your data is ill-defined for a mixed ANOVA. If we convert it to a wide-format representation, this becomes clearly visible:

image

You need to have:

In other words, there shouldn't be any NaN in the above screenshot.

Thanks Raphael
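
For readers without access to the screenshot, the wide-format check described above can be sketched with plain pandas. This is only an illustration: it rebuilds the sample data posted earlier in the thread, and the variable names (`wide`, `n_missing`, `groups_per_subject`) are assumptions, not part of pingouin.

```python
import pandas as pd

# Rebuild the long-format sample posted in this thread.
types = list("ABCDEFGHJ")
categories = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
ratings = {
    0: [3, 3, 4, 4, 3, 5, 4, 3, 3],
    1: [2, 3, 4, 4, 8, 6, 4, 7, 9],
}

data = pd.DataFrame(
    [
        {"type": t, "rating": r, "nid": nid, "category": c}
        for nid, values in ratings.items()
        for t, r, c in zip(types, values, categories)
    ]
)

# Wide format: one row per (subject, between-group) pair,
# one column per level of the within-subject factor.
wide = data.pivot_table(index=["nid", "category"], columns="type", values="rating")
print(wide)

# Count empty cells; a well-formed design should have none.
n_missing = int(wide.isna().sum().sum())
print("missing cells:", n_missing)

# Number of between-subject groups each subject appears in;
# in a mixed design this should be 1 for every subject.
groups_per_subject = data.groupby("nid")["category"].nunique()
print(groups_per_subject)
```

With this sample, every subject appears in all three `category` groups, and each `type` level occurs in only one `category`, so most cells of the wide table are empty. A mixed ANOVA expects each subject to belong to a single between-subject group and to have a value for every level of the within-subject factor.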