amueller / introduction_to_ml_with_python

Notebooks and code for the book "Introduction to Machine Learning with Python"

Problem/bug with MLPClassifier using neural network (P. 110, third release) #67

Closed AndersBennedsgaard closed 5 years ago

AndersBennedsgaard commented 6 years ago

Hello, I've gotten to the neural network part of the supervised machine learning chapter, and there is a slight problem with classification using MLPClassifier and mglearn.plots.plot_2d_separator. On page 110, it says:

X, y = make_moons(n_samples=100, noise=.25, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

mlp = MLPClassifier(solver='lbfgs', random_state=0).fit(X_train, y_train)
mglearn.plots.plot_2d_separator(mlp, X_train, fill=True, alpha=.3)
mglearn.discrete_scatter(X_train[:, 0], X_train[:, 1], y_train)

and when I call plot_2d_separator, it gives me an error:

Traceback (most recent call last):
  File "D:\Program Files\...\mglearn\plot_2d_separator.py", line 86, in plot_2d_separator
    decision_values = classifier.decision_function(X_grid)
AttributeError: 'MLPClassifier' object has no attribute 'decision_function'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/Program Files/.../chap2_16.py", line 22, in <module>
    mglearn.plots.plot_2d_separator(mlp, X_train, fill=True, alpha=.3)
  File "D:\Program Files\...\mglearn\plot_2d_separator.py", line 92, in plot_2d_separator
    decision_values = classifier.predict_proba(X_grid)[:, 1]
  File "C:\Users\...\sklearn\neural_network\multilayer_perceptron.py", line 1050, in predict_proba
    y_pred = self._predict(X)
  File "C:\Users\...\sklearn\neural_network\multilayer_perceptron.py", line 678, in _predict
    self._forward_pass(activations)
  File "C:\Users\...\sklearn\neural_network\multilayer_perceptron.py", line 105, in _forward_pass
    self.coefs_[i])
  File "C:\Users\...\sklearn\utils\extmath.py", line 140, in safe_sparse_dot
    return np.dot(a, b)
MemoryError

This can be fixed by setting hidden_layer_sizes=53 in MLPClassifier. Using more hidden layers, the error shows again.

amueller commented 6 years ago

Sorry you're having issues with the code. This seems very strange. Is that a reproducible error? You're running out of memory predicting on a tiny dataset. How much (free?) ram do you have and which version of scikit-learn and mglearn are you using?

AndersBennedsgaard commented 6 years ago

Well, no matter which IDE I use, or if I run it from CMD, it still gives the error. The later plots, fig. 2.52-53, also give the error. I have 16 GB installed at the moment, and Task Manager says 8.3 GB is not in use. I would try it on another computer, but I have none here with me. I'm using mglearn version 0.1.6 and scikit-learn version 0.19.1, installed with pip.

amueller commented 6 years ago

Sorry for the slow reply, I was out over the weekend. I'm not entirely sure what's happening here. Could you please give the types and shapes of a and b in the line that gives the memory error? (You can run it with python -m pdb or use the %debug magic in ipython or jupyter)

AndersBennedsgaard commented 6 years ago

Well, sorry for the incredibly slow reply, I had some exams to pass.. ;) I'm using PyCharm, and its debug feature gives me a lot of information:

a: dtype = float64, shape = (1000000, 2) (tuple)
b: dtype = float64, shape = (2, 100) (tuple)

amueller commented 6 years ago

Thanks for the comment. The way it is written now it creates a 1000000 x 100 matrix, which takes 762MB of RAM (according to my calculations at least ;). That's maybe not ideal. But it should work on your machine. Unless you have 32bit python installed and your whole python process is limited to 2gb of ram.
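
A quick sanity check of that estimate, as a sketch. It assumes the intermediate result is a dense float64 array of shape (1000000, 100), which is what the reported shapes of a and b imply; the helper name dense_array_mib is made up for illustration.

```python
def dense_array_mib(n_rows, n_cols, itemsize=8):
    """Size in MiB of a dense array; itemsize=8 corresponds to float64."""
    return n_rows * n_cols * itemsize / 2**20


# np.dot(a, b) with a.shape == (1000000, 2) and b.shape == (2, 100)
# produces a (1000000, 100) result.
size_mib = dense_array_mib(1_000_000, 100)
print(f"{size_mib:.1f} MiB")  # → 762.9 MiB
```

On its own that fits easily in 8 GB of free RAM, but a 32-bit process may be unable to find one contiguous block of that size within its limited address space.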

AndersBennedsgaard commented 6 years ago

After upgrading to 64-bit Python, the program works perfectly. I used 32-bit since I've heard some packages work better with that, but it might just be easier to use 64-bit with machine learning scripts :)

amueller commented 6 years ago

I would be surprised if 32bit worked better for anything.

minda163 commented 6 years ago

I have the same problem. My Python version is as follows: Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] on win32. mglearn version 0.1.6 and scikit-learn version 0.19.1, installed using pip. What should I do to solve the problem? I don't know how to upgrade to 64-bit Python. Or do I just need to add RAM?

amueller commented 6 years ago

It looks like you have 64-bit Python. How much RAM do you have? I'll try to post a fix soon, but if you have a reasonable amount of RAM it should be fine. The code really should be rewritten so that it doesn't need as much RAM.
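
For anyone unsure which build they are running, here is a small check using only the standard library. It is a general-purpose sketch, not something taken from mglearn or scikit-learn, and the helper name python_bitness is hypothetical.

```python
import struct
import sys


def python_bitness():
    """Return the pointer width of the running interpreter in bits."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


bits = python_bitness()
print(f"{bits}-bit Python")

# Cross-check: sys.maxsize is 2**63 - 1 on 64-bit builds and 2**31 - 1 on 32-bit builds.
print(sys.maxsize > 2**32)
```

Note that "on win32" in the interpreter banner only names the platform and appears on 64-bit Windows builds too; the "64 bit (AMD64)" part is what indicates the build.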

behreth commented 5 years ago

I stumbled over the same problem. Looking at sklearn.neural_network.MLPClassifier, there is no decision_function(), only a predict_proba() function. In many code snippets (even in your own, @amueller) I find the following code:

if hasattr(clf, "decision_function"):
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
else:
    Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]

... i.e. it depends on the classifier, so should that distinction be made in the mglearn package?

behreth commented 5 years ago

Sorry ... ignore my last comment - it was related to a memory constraint as well.

amueller commented 5 years ago

The "real" fix for this would be to use "working_memory" in the MLP in scikit-learn, but maybe we want a quick fix here? Working on the next sklearn release right now. @behreth if you want to send a PR to do some chunking in plot_2d_separator let me know ;)

behreth commented 5 years ago

Thanks @amueller for the hint and reference. I am not an expert in Python's memory management in the context of try/catch, but I'll take a look over the weekend.

amueller commented 5 years ago

The problem is that we usually can't try/catch; we have to be defensive. Once the memory is full, usually everything goes down. What scikit-learn is doing is basically "if this requires allocating more than 1gb, break it into chunks instead".
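
A minimal sketch of that defensive approach, applied to a grid prediction like the one in plot_2d_separator. This is not the code used by scikit-learn or by any fix to mglearn: predict_in_chunks and its chunk_size default are assumptions for illustration, and the stand-in function fake_scores takes the place of a fitted classifier's predict_proba.

```python
import numpy as np


def predict_in_chunks(predict_fn, X, chunk_size=10_000):
    """Apply predict_fn to X in row chunks and concatenate the results.

    Only chunk_size rows pass through predict_fn at a time, so any large
    intermediate arrays it allocates scale with chunk_size, not len(X).
    Assumes X has at least one row.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    parts = [predict_fn(X[start:start + chunk_size])
             for start in range(0, len(X), chunk_size)]
    return np.concatenate(parts)


# Hypothetical stand-in for something like clf.predict_proba(X)[:, 1].
def fake_scores(X):
    return X[:, 0] + X[:, 1]


X_grid = np.random.RandomState(0).rand(25_000, 2)
chunked = predict_in_chunks(fake_scores, X_grid, chunk_size=4_000)
```

With a fitted classifier clf, the call would look like predict_in_chunks(lambda X: clf.predict_proba(X)[:, 1], X_grid). Choosing the chunk size up front, rather than catching MemoryError afterwards, is the point of the defensive approach.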

behreth commented 5 years ago

Pull request is in, ... be nice to a seasoned Assembler and C++ programmer learning Python ;)

amueller commented 5 years ago

fixed in #106, thanks @behreth !