susanli2016 / Machine-Learning-with-Python

Python code for common Machine Learning Algorithms

DeprecationWarning, DataConversionWarning, NameErrors, FutureWarnings #6

Open marianoju opened 5 years ago

marianoju commented 5 years ago

Hi Susan,

I am trying to retrace your steps on this logistic regression. I started with your article “Building A Logistic Regression in Python, Step by Step” on DataScience+ and am now working through the latest commit of the Jupyter notebook used to make that post.

I have tried to reproduce your results with a clone of your notebook.

I have replaced `from sklearn.cross_validation import train_test_split` with `from sklearn.model_selection import train_test_split` because of a `DeprecationWarning` in cell 1.
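For anyone who needs the notebook to run on both old and new scikit-learn installs, a minimal sketch of a guarded import follows. It relies on `sklearn.model_selection` having been introduced in scikit-learn 0.18 and `sklearn.cross_validation` having been removed in 0.20. The toy arrays and the `test_size` and `random_state` values are illustrative assumptions, not the notebook's data or settings.

```python
import numpy as np

# Prefer the current module; fall back only on scikit-learn < 0.18.
try:
    from sklearn.model_selection import train_test_split
except ImportError:
    from sklearn.cross_validation import train_test_split

# Toy data (assumption): 10 samples, 2 features, binary target.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
print(X_train.shape, X_test.shape)
```

On a current scikit-learn install only the first import runs, so the fallback branch is harmless.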

Cell 24 raises a `DataConversionWarning`: `A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().` The warning is followed by the source line `y = column_or_1d(y, warn=True)`.

In cell 32 there is a `NameError: name 'classifier' is not defined`. You used `logreg.score` in cell 30. Replacing `classifier` with `logreg` in cell 32 works, but obviously produces exactly the same results as cell 30. I am not sure what you were trying to do here (use a different classifier?), or whether this is just an accidental duplicate left over after renaming `classifier` to the more specific `logreg`.

In the last section, titled “ROC Curvefrom sklearn import metrics”, it looks to me like you (accidentally) converted some Python code to Markdown. The code in cell 34 produces two `FutureWarning`s (`pandas.tslib is deprecated and will be removed in a future version.`) and two `NameError`s: `name 'clf1' is not defined` and `name 'Y_test' is not defined`.

Kind regards

marianoju commented 5 years ago

The `DataConversionWarning` in cell 24 can be fixed by replacing `data_final[y]` with `data_final[y].values.reshape((-1,))`.
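A minimal sketch of why the reshape helps, assuming (as the warning suggests) that `y` is a list of column names, so `data_final[y]` is a one-column DataFrame of shape `(n_samples, 1)`. The toy frame, its column names `x1` and `x2`, and the list `X` are hypothetical stand-ins for the notebook's data.

```python
import warnings

import pandas as pd
from sklearn.exceptions import DataConversionWarning
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the notebook's data_final.
data_final = pd.DataFrame({
    "x1": [0.1, 0.4, 0.9, 1.2, 0.8, 0.2],
    "x2": [1.0, 0.8, 0.3, 0.1, 0.4, 0.9],
    "y":  [0, 0, 1, 1, 1, 0],
})
X = ["x1", "x2"]  # feature column names (assumption)
y = ["y"]         # target given as a list, so indexing yields a DataFrame

y_2d = data_final[y]                         # shape (6, 1): a column vector
y_1d = data_final[y].values.reshape((-1,))   # shape (6,): what fit() expects

logreg = LogisticRegression()
with warnings.catch_warnings():
    # Turn the warning into an error to show the 1-D target does not trigger it.
    warnings.simplefilter("error", DataConversionWarning)
    logreg.fit(data_final[X], y_1d)

print(y_2d.shape, y_1d.shape)
```

`data_final[y].values.ravel()` gives the same 1-D array and matches the wording of the warning message.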

The `NameError`s in cell 34 can be fixed by replacing `clf1` with `logreg` and `Y_test` with `y_test`.
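With those two renames applied, the ROC step might look roughly like the sketch below. This is not the notebook's exact cell: the synthetic data and split settings are assumptions, and it scores with `predict_proba`, which may differ from what the notebook passes to the metric functions. Only the names `logreg` and `y_test` come from the issue.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption, not the bank dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# Probability of the positive class, used as the ranking score for the curve.
probs = logreg.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, probs)
auc = roc_auc_score(y_test, probs)
print("ROC AUC: %.3f" % auc)
```

The `fpr` and `tpr` arrays can then be passed to `matplotlib.pyplot.plot` to draw the curve.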