Closed AndersBennedsgaard closed 5 years ago
Sorry you're having issues with the code. This seems very strange. Is that a reproducible error? You're running out of memory predicting on a tiny dataset. How much (free?) ram do you have and which version of scikit-learn and mglearn are you using?
Well, no matter what IDE I use, or if I run it via the CMD, it still gives the error. The later plots, figs. 2.52-53, also give the error. I have 16 GB installed at the moment, and Task Manager says 8.3 GB is not being used. I would try it on another computer, but I have none here with me. mglearn version 0.1.6 and scikit-learn version 0.19.1, installed via pip.
Sorry for the slow reply, I was out over the weekend. I'm not entirely sure what's happening here. Could you please give the types and shapes of `a` and `b` in the line that gives the memory error? (You can run it with `python -m pdb` or use the `%debug` magic in IPython or Jupyter.)
Well, sorry for the incredibly slow reply, I had some exams to pass ;) I'm using PyCharm, and with its debug feature I can get a lot of information: `a`: dtype=float64, shape=(1000000, 2); `b`: dtype=float64, shape=(2, 100).
Thanks for the comment. The way it is written now, it creates a 1000000 x 100 matrix, which takes 762 MB of RAM (according to my calculations, at least ;). That's maybe not ideal, but it should work on your machine, unless you have 32-bit Python installed and your whole Python process is limited to 2 GB of RAM.
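For reference, the 762 MB figure can be checked in a few lines. The shapes are the ones reported above; the arrays here are just zero-filled placeholders standing in for the real grid and weights:

```python
import numpy as np

# Placeholder arrays with the shapes reported in this thread:
a = np.zeros((1_000_000, 2))   # the grid of test points
b = np.zeros((2, 100))         # weights for a 100-unit hidden layer

# The intermediate product a @ b must hold 1000000 x 100 float64 values.
n_bytes = a.shape[0] * b.shape[1] * np.dtype(np.float64).itemsize
print(n_bytes / 2**20)  # ~762.9 MiB
```

That single allocation is fine on a 64-bit machine with free RAM, but it easily blows the ~2 GB address-space limit of a 32-bit Python process.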
After upgrading to 64-bit Python, the program works perfectly. I used 32-bit since I've heard some packages work better with it, but it might just be easier to use 64-bit for machine learning scripts :)
I would be surprised if 32bit worked better for anything.
I have the same problem. My Python version is as follows: Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] on win32, with mglearn version 0.1.6 and scikit-learn version 0.19.1 installed using pip. What should I do to solve the problem? I don't know how to upgrade to 64-bit Python. Or do I just need to add RAM?
It looks like you have 64-bit Python. How much RAM do you have? I'll try to post a fix soon, but if you have a reasonable amount of RAM it should be fine. The code really should be rewritten so as not to need this much RAM.
I stumbled over the same problem. Looking at sklearn.neural_network.MLPClassifier, there is no decision_function(), only a predict_proba() method. In many code snippets (even in your own, @amueller) I find the following code:

```python
if hasattr(clf, "decision_function"):
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
else:
    Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
```

... i.e., which function is called depends on the classifier. So should that distinction be in the mglearn package?
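For context, a self-contained sketch of that dispatch pattern with a classifier that lacks decision_function (the dataset, grid size, and hyperparameters here are made up for illustration, not taken from the book):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=100, noise=0.25, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000,
                    random_state=0).fit(X, y)

# A small evaluation grid (kept deliberately tiny, unlike the huge one
# that triggered the MemoryError in this issue).
xx, yy = np.meshgrid(np.linspace(-2, 3, 50), np.linspace(-2, 2, 50))
grid = np.c_[xx.ravel(), yy.ravel()]

# MLPClassifier exposes no decision_function, so this falls through
# to the predict_proba branch and keeps the positive-class column.
if hasattr(clf, "decision_function"):
    Z = clf.decision_function(grid)
else:
    Z = clf.predict_proba(grid)[:, 1]
print(Z.shape)  # (2500,)
```

Either branch yields one value per grid point, which is what the contour plotting code needs.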
Sorry ... ignore my last comment - it was related to a memory constraint as well.
The "real" fix for this would be to use "working_memory" in the MLP in scikit-learn, but maybe we want a quick fix here? Working on the next sklearn release right now. @behreth if you want to send a PR to do some chunking in plot_2d_separator let me know ;)
Thanks @amueller for the hint and reference. I am not an expert in Python's memory management in the context of try/catch, but I'll take a look over the weekend.
The problem is that we usually can't try/catch; we have to be defensive. Once memory is full, usually everything goes down. What scikit-learn does is basically: "if this requires allocating more than 1 GB, break it into chunks instead".
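That chunking idea could be sketched roughly like this. Note that `predict_in_chunks` is a hypothetical helper for illustration, not the actual mglearn fix or the scikit-learn working_memory machinery, and it only budgets the output array, not the model's internal intermediates:

```python
import numpy as np

def predict_in_chunks(predict_fn, X, max_bytes=256 * 2**20, n_outputs=1):
    """Call predict_fn on slices of X so that no single call produces
    an output larger than roughly max_bytes (hypothetical helper)."""
    row_bytes = max(n_outputs, 1) * 8          # float64 per sample
    chunk = max(1, int(max_bytes // row_bytes))  # rows per slice
    parts = [predict_fn(X[i:i + chunk]) for i in range(0, len(X), chunk)]
    return np.concatenate(parts)

# Demo with a cheap stand-in for a classifier's decision_function;
# max_bytes is set low here just to force several chunks.
X = np.random.rand(1_000_000, 2)
Z = predict_in_chunks(lambda x: x.sum(axis=1), X, max_bytes=1_000_000)
print(Z.shape)  # (1000000,)
```

The key design point is being defensive up front: the peak allocation is bounded before any call is made, rather than reacting after a MemoryError, which is usually too late.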
The "real" fix for this would be to use "working_memory" in the MLP in scikit-learn, but maybe we want a quick fix here? Working on the next sklearn release right now. @behreth if you want to send a PR to do some chunking in plot_2d_separator let me know ;)
Pull request is in, ... be nice to a seasoned Assembler and C++ programmer learning Python ;)
Fixed in #106, thanks @behreth!
Hello, I've gotten to the neural network part of supervised machine learning, and there is a slight problem with the classification using MLPClassifier and mglearn.plots.plot_2d_separator. On page 110, it says
and using plot_2d_separator, it gives me an error:
This can be fixed by setting hidden_layer_sizes=53 in MLPClassifier. With larger hidden layers, the error appears again.
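For anyone following along, the workaround probably looks something like this. The dataset and solver are assumptions based on the book's usual two-moons examples, not quoted from it; only the hidden_layer_sizes=53 part comes from this thread:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=100, noise=0.25, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Shrinking the hidden layer from the default (100,) shrinks the
# intermediate matrix plot_2d_separator's grid predictions require,
# which is what avoided the MemoryError before the fix in #106.
mlp = MLPClassifier(solver='lbfgs', random_state=0,
                    hidden_layer_sizes=[53]).fit(X_train, y_train)

# mglearn.plots.plot_2d_separator(mlp, X_train, fill=True, alpha=.3)
```

With the chunking fix merged, the default hidden_layer_sizes should work again and this reduction is no longer necessary.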