ageron / handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Apache License 2.0
27.76k stars 12.73k forks

Chapter 2 Question 5 Score failed #521

Open SichengYang opened 2 years ago

SichengYang commented 2 years ago

I opened Jupyter Notebook and ran the program. It said "Score failed" and printed lots of errors. Everything before that seems to work fine.

I do not know how to solve it. Thanks!

[Two screenshots attached: Screen Shot 2022-01-04 at 11 19 49 PM, Screen Shot 2022-01-04 at 11 20 12 PM]

ageron commented 2 years ago

Hi @SichengYang, thanks for your question. These errors are displayed but training continues, right? Do you see ValueError: Found unknown categories ['ISLAND'] in column 0 during transform? If so, I believe the problem is not very important: it is effectively just a warning. Let me explain.

The ISLAND category is rare in the training set, so during K-fold cross-validation it's possible for all instances of ISLAND to end up in the validation fold, with none at all in the training folds. The OneHotEncoder fitted on those training folds has then never seen ISLAND, so it treats it as an unknown category when the model is evaluated against the validation fold, which raises the error. That's not a big deal: it just means that we're missing out on a few hyperparameter combinations. You can ignore this problem or, if you prefer to avoid it completely, you can tell the OneHotEncoder to ignore unknown categories rather than raise an error (which is the default behavior). This is done by setting handle_unknown="ignore" when creating the OneHotEncoder.
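Here is a minimal sketch of the difference between the two settings. The arrays train_fold and valid_fold are made-up stand-ins for one cross-validation split of the categorical column (they are not taken from the notebook), and the variable names are only illustrative. The same handle_unknown="ignore" argument would go wherever the notebook creates its OneHotEncoder.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy split: the training fold happens to contain no ISLAND rows.
train_fold = np.array([["INLAND"], ["NEAR BAY"], ["INLAND"], ["NEAR OCEAN"]])
valid_fold = np.array([["ISLAND"], ["INLAND"]])

# Default behavior (handle_unknown="error"): transform raises a ValueError
# when it meets a category that was absent during fit.
strict_encoder = OneHotEncoder()
strict_encoder.fit(train_fold)
try:
    strict_encoder.transform(valid_fold)
    strict_raised = False
except ValueError as exc:
    strict_raised = True
    print("Default encoder failed:", exc)

# With handle_unknown="ignore", the unseen category is encoded as a row of
# zeros instead of raising, so scoring on the validation fold can proceed.
lenient_encoder = OneHotEncoder(handle_unknown="ignore")
lenient_encoder.fit(train_fold)
encoded = lenient_encoder.transform(valid_fold).toarray()
print(encoded)
```

The first row of encoded (the ISLAND instance) is all zeros, while the second row (INLAND) has a single 1 in the INLAND column. The trade-off is that the model cannot distinguish an unknown category from "none of the known ones", which seems acceptable here given how rare ISLAND is.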

I hope this helps.

SichengYang commented 2 years ago

Thank you!