Different results in chapter 4

franciscoss commented 3 years ago

Hello! Thanks for your work on this :)

I'm working through chapter 4, on classification. Comparing your answers with the book, I found that the KNN results on the Caravan data differ: in the book, increasing K improves precision, but in your answers it does not.

The weird thing is that when I run the code as instructed in the book, I get the same results as your tidymodels version, i.e., no model improvement when increasing K.

Maybe the book is mistaken, but I found nothing about this in the errata: https://www.statlearning.com/errata-second-edition

Do you know what could be causing the difference?

Thanks again,

EmilHvitfeldt commented 3 years ago

I don't know. It could be related to many things, such as the seed and the specific method used to fit the model.

RaymondBalise commented 3 years ago

I am also a huge fan of this project. The results for the last example before Caravan are also a bit different. I don't understand why class::knn() and kknn::kknn() give different answers when I set the arguments to match. The only explanation I can think of is that there are ties among the nearest neighbors and the two packages use different rules to break them. I posted a simplified version of the question to Stack Overflow. Hopefully one of the package authors will notice.
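
To illustrate the tie hypothesis, here is a minimal sketch in Python rather than R. It is not the code of class::knn() or kknn::kknn(); the function `knn_predict`, its `tie_break` argument, and the toy data are all hypothetical. It only shows that two implementations can agree on the neighbors and still disagree on the predicted class when the vote is tied.

```python
import random
from collections import Counter


def knn_predict(train, query, k, tie_break="first", rng=None):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (x, label) pairs with a scalar x (1-D for simplicity).
    tie_break: "first" returns the alphabetically first tied label;
               "random" picks uniformly among the tied labels.
    """
    # Sort by distance to the query. Python's sort is stable, so
    # equidistant points keep their original training order.
    neighbours = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    top = max(votes.values())
    tied = sorted(label for label, n in votes.items() if n == top)
    if len(tied) == 1 or tie_break == "first":
        return tied[0]
    rng = rng or random.Random()
    return rng.choice(tied)


# Hypothetical toy data: with k = 2 the two nearest points disagree,
# so the vote is tied at one "No" and one "Yes".
train = [(0.0, "No"), (2.0, "Yes"), (5.0, "Yes")]
query = 1.0

deterministic = knn_predict(train, query, k=2, tie_break="first")
randomised = {
    knn_predict(train, query, k=2, tie_break="random", rng=random.Random(seed))
    for seed in range(50)
}
print(deterministic)       # the fixed rule always returns the same label
print(sorted(randomised))  # labels seen across seeds under the random rule
```

With an odd k and two classes the vote itself cannot tie, so under this hypothesis the differences would show up mainly for even k or when several points are equally distant from the query.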

EmilHvitfeldt commented 3 years ago

Sorry about the late answer. I took some time to investigate and posted my findings on Stack Overflow.

EmilHvitfeldt / ISLR-tidymodels-labs, issue #12