benmiroglio / pymatch

MIT License

Fixed error messages with pandas >0.23.4 as proposed by zottacko #17

mc51 closed 4 years ago

mc51 commented 5 years ago

It would be great to have this working out of the box, so I took @zottacko's fix for the errors and warnings raised with pandas versions > 0.23.4 and turned it into a proper pull request. I tested it with pandas 0.24.2 and it works.

I also checked that we get the same results on this toy dataset:

# sklearn.datasets.samples_generator was removed in scikit-learn 0.24; import from sklearn.datasets instead
from sklearn.datasets import make_blobs
from pymatch.Matcher import Matcher
import pandas as pd
import numpy as np

np.random.seed(1)

X, y = make_blobs(n_samples=5000, centers=2, n_features=2, cluster_std=3.5)
df = pd.DataFrame(dict(x=X[:,0], y=X[:,1], label=y))
df['population'] = np.random.choice([1, 0], size=len(df), p=[0.8, 0.2])

control = df[df.label == 1]
test = df[df.label == 0]

m = Matcher(test, control, yvar="population", exclude=['label'])

m.fit_scores(balance=False, nmodels=10)
m.predict_scores()
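One way the "same results" check could be made explicit is sketched below. It assumes the propensity scores produced by each version have been exported to plain Python lists; the function name `scores_match`, the tolerance, and the sample values are hypothetical and not part of pymatch.

```python
import math


def scores_match(old_scores, new_scores, abs_tol=1e-9):
    """Return True if both sequences have the same length and
    every pair of corresponding scores is within abs_tol."""
    if len(old_scores) != len(new_scores):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(old_scores, new_scores)
    )


# Placeholder values for illustration only, not real pymatch output.
old = [0.81, 0.79, 0.12]
new = [0.81, 0.79, 0.12]
print(scores_match(old, new))
```

Comparing with a tolerance rather than `==` avoids false alarms from floating-point noise between library versions.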

Edit: I pushed a new commit. It adds a minor version, 0.3.4.1, containing the fix from the previous commit. I also rebuilt the package with setuptools so that the fixes are included in build and dist and the package works with pip install.
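To confirm that pip picked up the rebuilt release, the installed version can be queried from Python. This is a minimal sketch that assumes Python 3.8+ (for `importlib.metadata`) and that the distribution is registered under the name `pymatch`; the helper `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution,
    or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution name; prints None if the package is not installed.
print(installed_version("pymatch"))
```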

mc51 commented 5 years ago

Fixed #11 and #18

ktmud commented 4 years ago

This breaks compare_categorical and compare_continuous.

With this change, all variables are not deemed categorical.