Closed Ryo-F closed 4 years ago
Thanks for picking this up, Ryo. We'll take a look!
To reproduce:
import random
import pandas as pd
from tableone import TableOne
random.seed(1)
fruit = ['apple','banana','orange','pineapple','lemon','durian','peach']
n = 4
# Build 7 rows, each a random sample of n fruits (one value per basket column)
rows = [random.sample(fruit, n) for _ in range(7)]
df = pd.DataFrame(rows, columns=['basket1','basket2','basket3','basket4'])
df
t1 = TableOne(df, categorical = ['basket1','basket2','basket3','basket4'])
t1
df.loc[1:3,'basket2'] = None
df.loc[2:4,'basket3'] = None
t2 = TableOne(df, categorical = ['basket1','basket2','basket3','basket4'])
t2
In t2, the rows that contain nulls are missing.
Thanks again @Ryo-F. Fixed in version 0.6.0! I'll keep this issue open until we have added some tests to https://github.com/tompollard/tableone/blob/master/test_tableone.py.
Hi, I've been using tableone until I ran into this problem.
When using TableOne(categorical=categorical_columns), if categorical_columns contains more than two columns and each of them contains NaN rows, some rows are deleted by the dropna() call at tableone.py#L468.
I've created a PR to fix this problem, please take a look!
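For anyone trying to see why a single dropna() over several columns loses rows, here is a minimal pandas-only sketch. It is not the tableone source; the small frame and the names joint_counts and per_column_counts are made up for illustration. It just contrasts dropping nulls across all categorical columns at once with dropping them one column at a time.

```python
import pandas as pd

# Hypothetical data: one null in basket2 and one in basket3, on different rows.
df = pd.DataFrame({
    "basket1": ["apple", "banana", "orange", "lemon"],
    "basket2": ["peach", None, "durian", "apple"],
    "basket3": ["lemon", "orange", None, "banana"],
})
categorical_columns = ["basket1", "basket2", "basket3"]

# dropna() on the whole selection removes a row if ANY column is null,
# so valid values in the other columns of that row are lost too.
joint = df[categorical_columns].dropna()
joint_counts = {c: int(joint[c].count()) for c in categorical_columns}

# Dropping nulls column by column keeps every non-null value.
per_column_counts = {c: int(df[c].dropna().count()) for c in categorical_columns}

print(joint_counts)       # {'basket1': 2, 'basket2': 2, 'basket3': 2}
print(per_column_counts)  # {'basket1': 4, 'basket2': 3, 'basket3': 3}
```

With the joint drop, basket1 reports only 2 observations even though it has no nulls at all, which matches the symptom described above.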