rasbt / python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource
MIT License
12.18k stars 4.39k forks

Chapter 2 (Rosenblatt Perceptron): "misclassifications per epochs" on p. 30 are misleading #27

Closed simon-clematide closed 7 years ago

simon-clematide commented 7 years ago

First things first: I absolutely like how you motivate, introduce and implement the relevant concepts in your book.

I think there is a problem with the Rosenblatt perceptron learning description (evaluation) as presented in the figure on page 30 of the book. The errors counted in the variable `errors` are the number of updates performed in one epoch. However, this number does not represent the number of misclassifications after each epoch. For instance, if you use your standard options but train for only one iteration, there will be two updates ("2 errors" according to your terminology); however, all items will be classified as -1 (Setosas), so there are 50 misclassifications and this classifier's error rate is actually 50%.
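To illustrate the distinction, here is a minimal sketch in the spirit of the book's Chapter 2 perceptron (the toy data, seed, and hyperparameters below are my own assumptions, not the book's): after a single epoch, the number of weight updates and the number of samples the final weights misclassify can differ.

```python
import numpy as np

# Minimal perceptron in the style of the book's Chapter 2 code.
class Perceptron:
    def __init__(self, eta=0.1, n_iter=1, seed=1):
        self.eta = eta
        self.n_iter = n_iter
        self.seed = seed

    def fit(self, X, y):
        rgen = np.random.RandomState(self.seed)
        self.w_ = rgen.normal(scale=0.01, size=1 + X.shape[1])
        self.errors_ = []  # counts *updates* per epoch, not misclassifications
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            self.errors_.append(errors)
        return self

    def net_input(self, X):
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        return np.where(self.net_input(X) >= 0.0, 1, -1)

# Two toy clusters (illustrative stand-in for the Iris subset).
rgen = np.random.RandomState(0)
X = np.vstack([rgen.normal(-2, 0.5, (50, 2)), rgen.normal(2, 0.5, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

ppn = Perceptron(n_iter=1).fit(X, y)
updates = ppn.errors_[0]                      # updates made during the epoch
misclassified = (ppn.predict(X) != y).sum()   # errors of the *final* weights
print(updates, misclassified)  # the two counts need not agree
```

Printing `ppn.errors_` next to a post-hoc `(ppn.predict(X) != y).sum()` makes the difference between "updates per epoch" and "misclassifications after the epoch" explicit.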

[image: the errors-per-epoch figure from p. 30]

rasbt commented 7 years ago

Thanks for the feedback! Yeah, the plot on page 30 may be a bit confusing: the "count" on the y-axis is not the error rate on the training set but rather the number of errors in the current epoch. E.g., in epoch 2, it misclassified 2 samples; in epochs 6-10, nothing was misclassified anymore. Do you think that changing the y-label to something like "number of updates" or "number of errors in a given epoch" would help make things clearer?

[screenshot: errors-per-epoch plot from the notebook]

simon-clematide commented 7 years ago

I think "number of updates" would be the most exact, "(due to misclassifications)" could be added in the running text.

This number probably depends on the order of the examples. I noticed that your "real life" implementation of the perceptron shuffles the examples optionally (https://github.com/rasbt/mlxtend/blob/master/mlxtend/classifier/perceptron.py).

PS: I was asking myself whether the interpretation of the `errors` variable would be more straightforward if the actual update would be conditioned on a non-zero value of the computed update. Something like

```python
update = self.eta * (target - self.predict(xi))
if update != 0.0:
    self.w_[1:] += update * xi
    self.w_[0] += update
    errors += 1
```
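Wrapped as a runnable sketch (the helper function name and the zero starting weights below are hypothetical, just to exercise the conditional update on a single sample):

```python
import numpy as np

# Sketch of the suggested conditional update for one training sample.
# The names (w, eta, xi, target) mirror the book's code; the helper is mine.
def update_weights(w, xi, target, eta=0.1):
    """Apply one perceptron update only if xi is misclassified.

    Returns the (possibly unchanged) weights and 1 if an update
    happened, else 0, so the caller can count errors per epoch.
    """
    pred = 1 if np.dot(xi, w[1:]) + w[0] >= 0.0 else -1
    update = eta * (target - pred)
    if update != 0.0:
        w[1:] += update * xi
        w[0] += update
        return w, 1
    return w, 0

w = np.zeros(3)
w, err = update_weights(w, np.array([1.0, 2.0]), target=-1)
print(err)  # 1: the zero-weight start predicts +1, so this -1 sample triggers an update
```

A second call on the same sample then returns 0, since the corrected weights now classify it as -1 and no update fires.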
rasbt commented 7 years ago

> I think "number of updates" would be the most exact, "(due to misclassifications)" could be added in the running text.

I agree, that sounds like a good idea!

> This number probably depends on the order of the examples. I noticed that your "real life" implementation of the perceptron shuffles the examples optionally (https://github.com/rasbt/mlxtend/blob/master/mlxtend/classifier/perceptron.py).

It does. Just checked the chapter and you're right, I didn't mention shuffling ... hm ... maybe I left it out at this point to keep things simple(r) and to introduce things incrementally. I did mention it in the Adaline section: "Furthermore, we will add an option to shuffle the training data before each epoch to avoid cycles when we are optimizing the cost function; via the random_state parameter, we allow the specification of a random seed for consistency." Maybe adding a back reference to the perceptron would be nice, and adding a few more comments in the notebooks certainly doesn't hurt (since I don't have page limits there :))
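The per-epoch shuffling described here can be sketched as a standalone helper (the generator name and the toy arrays are my own illustrative choices; both the book's Adaline and the mlxtend Perceptron do this inside `fit` with a seeded random state):

```python
import numpy as np

# Sketch: reshuffle the training data before each epoch, seeded for
# reproducibility, as the quoted Adaline passage describes.
def shuffled_epochs(X, y, n_iter=5, seed=1):
    rgen = np.random.RandomState(seed)
    for _ in range(n_iter):
        idx = rgen.permutation(len(y))  # new sample order each epoch
        yield X[idx], y[idx]

X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([-1, -1, -1, 1, 1, 1])
for X_ep, y_ep in shuffled_epochs(X, y, n_iter=2):
    print(y_ep)  # label order varies from epoch to epoch
```

Since the perceptron's update count depends on the order in which samples are visited, reshuffling each epoch also changes the `errors_` trace from run to run unless the seed is fixed.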

> I was asking myself whether the interpretation of the errors variable would be more straightforward if the actual update would be conditioned on a non-zero value of the computed update. Something like

```python
update = self.eta * (target - self.predict(xi))
if update != 0.0:
    self.w_[1:] += update * xi
    self.w_[0] += update
    errors += 1
```

Agreed. I could imagine that it would also be a tad faster on average.

Will make the updates some time next week; I really have to get back to my PyData presentation, which is already on Friday. :P Thanks for all the suggestions!

simon-clematide commented 7 years ago

Adding shuffling to the simple perceptron and observing the consequences seems like a good simple exercise for the learner at the end of Chapter 2 :-)

rasbt commented 7 years ago

@simon-clematide Thanks again for your suggestion, I really appreciate it. I just added an additional comment about shuffling to the notebook (plus an example of what the perceptron code looks like if that's implemented). Also, the figure's y-axis label ("Number of updates") should be fixed now to avoid confusion.