Hi! Following the example in section 13.2.2 to perform logistic regression using sklearn on the acs_ny.csv dataset results in a ConvergenceWarning, and the intercept and coefficients it produces don't match those in the book.
To address this warning I increased the max number of iterations as follows:
lr.max_iter = 1000 # my default value was 100
But my intercept and coefficients still don't match those in the book.
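For reference, here is a minimal sketch of raising the iteration cap and checking whether the solver actually stopped before reaching it. It uses synthetic stand-in data (an assumption, not acs_ny.csv), and the variable names are only illustrative. Passing max_iter to the constructor should be equivalent to setting lr.max_iter before calling fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (hypothetical; NOT the acs_ny.csv predictors).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

# Raise the iteration cap when constructing the estimator.
lr = LogisticRegression(max_iter=1000)
results = lr.fit(X, y)

# n_iter_ records how many iterations the solver actually used.
# A value below max_iter means it stopped on its tolerance, not on the cap.
iterations_used = int(results.n_iter_.max())
converged = iterations_used < lr.max_iter
print(iterations_used, converged)
```

If iterations_used equals max_iter on the real data, the cap is still being hit and the warning would be expected to reappear.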
Per the book's guidance, I ran the following commands to get the intercept and coefficients:
import numpy as np
import pandas as pd
values = np.append(results.intercept_, results.coef_)
names = np.append('intercept', predictors.columns)
coefs = pd.DataFrame(values, index = names, columns=['coefs'])
coefs
And these are the results I get:
                          coefs        or
intercept             -5.632904  0.003578
HouseCosts             0.000726  1.000726
NumWorkers             0.581870  1.789382
NumBedrooms            0.238619  1.269495
OwnRent_Outright       0.570278  1.768759
OwnRent_Rented        -0.692253  0.500447
FamilyType_Male Head  -0.330524  0.718547
FamilyType_Married     1.224612  3.402845
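The commands shown earlier only build the coefs column. The or column appears to be the odds ratio, that is, the exponential of each coefficient; this is an assumption based on the numbers, not something stated in the post. A quick sketch with the standard library reproduces it from the values above (with pandas, np.exp(coefs['coefs']) would be the vectorized equivalent):

```python
import math

# Coefficients copied from the output above.
coefs = {
    "intercept": -5.632904,
    "HouseCosts": 0.000726,
    "NumWorkers": 0.581870,
    "NumBedrooms": 0.238619,
    "OwnRent_Outright": 0.570278,
    "OwnRent_Rented": -0.692253,
    "FamilyType_Male Head": -0.330524,
    "FamilyType_Married": 1.224612,
}

# Assumed definition of the "or" column: odds ratio = exp(coefficient).
odds_ratios = {name: math.exp(value) for name, value in coefs.items()}

for name, value in odds_ratios.items():
    print(f"{name:<22} {value:.6f}")
```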
These are very different from those in the book:
                           coef        or
intercept             -5.492705  0.004117
HouseCosts             0.000710  1.000710
NumWorkers             0.559836  1.750385
NumBedrooms            0.222619  1.249345
OwnRent_Outright       1.180146  3.254851
OwnRent_Rented        -0.730046  0.481887
FamilyType_Male Head   0.318643  1.375260
FamilyType_Married     1.213134  3.364012
I wouldn't be as thrown off if the differences were a few decimal points or so, but my results assign substantially less weight to OwnRent_Outright (0.570278 vs. 1.180146) and to FamilyType_Male Head, whose sign even flips (-0.330524 vs. 0.318643), and I have no idea why...
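To put numbers on the gap, the two sets of coefficients can be compared term by term. This is just a sketch using the values copied from the two tables; the dictionary names are only illustrative.

```python
# Coefficients copied from the two tables (my output vs. the book's).
mine = {
    "intercept": -5.632904,
    "HouseCosts": 0.000726,
    "NumWorkers": 0.581870,
    "NumBedrooms": 0.238619,
    "OwnRent_Outright": 0.570278,
    "OwnRent_Rented": -0.692253,
    "FamilyType_Male Head": -0.330524,
    "FamilyType_Married": 1.224612,
}
book = {
    "intercept": -5.492705,
    "HouseCosts": 0.000710,
    "NumWorkers": 0.559836,
    "NumBedrooms": 0.222619,
    "OwnRent_Outright": 1.180146,
    "OwnRent_Rented": -0.730046,
    "FamilyType_Male Head": 0.318643,
    "FamilyType_Married": 1.213134,
}

# Difference (mine - book) per term, ranked by absolute size of the gap.
diffs = {name: mine[name] - book[name] for name in mine}
ranked = sorted(diffs, key=lambda name: abs(diffs[name]), reverse=True)

for name in ranked:
    print(f"{name:<22} {diffs[name]:+.6f}")
```

The two terms at the top of the ranking are the ones called out above; the remaining gaps are all much smaller.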
PS - I have VERY little experience with statistics and data science; my background is computer science.