pydata / patsy

Describing statistical models in Python using symbolic formulas

dmatrices raising “AssertionError” #148

Open RandomForestRanger opened 5 years ago

RandomForestRanger commented 5 years ago

Totally inexperienced user. This is my first Negative Binomial Regression, run in IPython on Google's Colab. I load the dataset as a pandas DataFrame. The features (and Target) in the formula below all appear in the DataFrame (which I named "dataset").

I also bring in

from patsy import dmatrices
import statsmodels.api as sm

However, when I run

formula = """Target ~ MeanAge   + %White + %HHsNotWater + HHsIneq*10    + %NotSaLang + %male + %Informal + COGTACatG2B09 + %Poor + AGRating  """
data = dataset

response, predictors = dmatrices(formula, data, return_type='dataframe')
nb_results = sm.GLM(response, predictors, family=sm.families.NegativeBinomial(alpha=0.15)).fit()
print(nb_results.summary())

I simply get "AssertionError: " and an arrow pointing to line four (the one starting "response"). I have no idea how to remedy this and cannot figure out why it happens. Is it a Patsy issue? A Colab issue? A daft coding issue? Any sage guidance, please?

njsmith commented 5 years ago

Sounds like your environment is messed up somehow – Python errors should always have more details than that!

I don't know what's causing it, but some issues jump out at me in your formula:

RandomForestRanger commented 5 years ago

@njsmith - that was it! After fixing the formula, all fell into place. Many thanks.
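
For anyone landing here with a similar formula, here is a minimal sketch of what such a fix can look like. It rests on two assumptions about the formula in the original post: that names like `%White` need quoting because they are not valid Python identifiers, and that `HHsIneq*10` was meant as arithmetic, whereas in formula syntax `a*b` expands to `a + b + a:b`. Patsy's `Q()` looks up a column by its literal name, and `I()` protects arithmetic from formula parsing. The toy DataFrame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices

# Hypothetical stand-in for the poster's `dataset`; all values are made up.
data = pd.DataFrame({
    "Target": [3, 0, 5, 2, 7, 1],
    "MeanAge": [31.0, 28.5, 40.2, 35.1, 29.9, 33.3],
    "%White": [0.10, 0.45, 0.05, 0.30, 0.20, 0.60],
    "HHsIneq": [0.61, 0.58, 0.70, 0.66, 0.59, 0.63],
})

# Q('...') quotes a column whose name is not a valid Python identifier.
# I(...) makes `*` mean multiplication instead of a formula interaction.
formula = "Target ~ MeanAge + Q('%White') + I(HHsIneq * 10)"

response, predictors = dmatrices(formula, data, return_type="dataframe")

# Inspect which columns ended up in the design matrix.
print(predictors.columns.tolist())
```

The resulting `response` and `predictors` can then be passed to `sm.GLM` in the same way as in the original post.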

bhaktatejas922 commented 5 years ago

I have a similar issue. When I run this, I get a blank AssertionError as well:

```
outcome_1, predictors_1 = patsy.dmatrices("Q('%received 18+') ~ n_killed_capita_2016", aggregate_2016_df)
mod_1 = sm.OLS(outcome_1, predictors_1)
res_1 = mod_1.fit()
print(res_1.summary())
```
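
A blank `AssertionError` usually means an `assert` statement without a message failed, so the useful information is in the traceback and in the installed versions rather than in the error text. Below is a minimal sketch of collecting both for a report. The helper name `describe_failure` and the toy DataFrame are hypothetical; only the column names are taken from the snippet above.

```python
import sys
import traceback

import pandas as pd
import patsy


def describe_failure(formula, data):
    """Return None if the formula builds, else versions plus the full traceback."""
    try:
        patsy.dmatrices(formula, data)
    except Exception:
        versions = (
            f"python {sys.version.split()[0]}, "
            f"patsy {patsy.__version__}, "
            f"pandas {pd.__version__}"
        )
        # format_exc() keeps the whole call stack, not just the final line.
        return versions + "\n" + traceback.format_exc()
    return None


# Made-up data using the column names from the snippet above.
toy = pd.DataFrame({
    "%received 18+": [0.71, 0.64, 0.80, 0.58],
    "n_killed_capita_2016": [0.002, 0.005, 0.001, 0.004],
})

report = describe_failure("Q('%received 18+') ~ n_killed_capita_2016", toy)
print(report or "formula built without error")
```

Pasting the printed report into the issue gives maintainers the call stack and version numbers that the bare error message leaves out.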