You have a great grasp of EDA: you looked at the data in different ways and checked for correlations before jumping into models. Good job with the boxplot, and great job comparing different models as well.
A few pointers and suggestions as I was going through your notebook:
1) While it's great that there are values in every cell (no NaN values), there are a bunch of "unknown" values that might not help in predicting y. I saw an lmplot for campaign/y/marital, which gives great info, but I noticed that all of the red "unknown" marital statuses were to the left (low campaign numbers). I don't have domain knowledge about this dataset, but I wonder how many unknowns there are and whether the models would perform better with those rows dropped.
2) For select_dtypes, you can select the ints AND the floats in one call by passing a list to the include parameter.
3) Lastly, this was probably just an oversight on your part, but you were getting such bizarre results from your KNN model because you were using the KNN Regressor, which predicts continuous values, instead of the KNN Classifier, which predicts class labels. Keep an eye out for that next time.
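
To make the three pointers above concrete, here is a rough sketch on a tiny made-up DataFrame. It is not your notebook's code: the values are invented, the float column "rate" is a hypothetical stand-in, and I am assuming pandas and scikit-learn (KNeighborsClassifier / KNeighborsRegressor) since that is what the notebook appeared to use.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Toy stand-in for the real data; every value here is made up for illustration.
df = pd.DataFrame({
    "age": [25, 40, 33, 58, 47, 29, 51, 36],
    "campaign": [1, 2, 1, 5, 3, 1, 4, 2],
    "rate": [1.1, 4.9, 1.3, 4.8, 4.7, 1.2, 4.9, 1.4],  # hypothetical float column
    "marital": ["single", "married", "unknown", "married",
                "divorced", "unknown", "married", "single"],
    "y": [1, 0, 1, 0, 0, 1, 0, 1],
})

# 1) Count the "unknown" placeholders, then build a copy without those rows
#    so models can be compared with and without them.
cat_cols = ["marital"]
unknown_counts = (df[cat_cols] == "unknown").sum()
df_known = df[df["marital"] != "unknown"]

# 2) Pass a list to `include` to pick up ints and floats in one call.
#    include="number" is a shorthand that selects all numeric dtypes.
numeric = df.select_dtypes(include=["int64", "float64"])

# 3) Classifier vs. regressor on the same binary target.
#    Predicting on the training rows is only for illustration.
X = df[["age", "campaign", "rate"]]
y = df["y"]

clf_pred = KNeighborsClassifier(n_neighbors=3).fit(X, y).predict(X)
reg_pred = KNeighborsRegressor(n_neighbors=3).fit(X, y).predict(X)

print(unknown_counts)
print(len(df), "rows before dropping unknowns,", len(df_known), "after")
print(list(numeric.columns))
print("classifier:", clf_pred)  # class labels only
print("regressor: ", reg_pred)  # averages of neighbours' y, can be fractional
```

The regressor averages the y values of the nearest neighbours, so on a 0/1 target it can return fractions, which is why accuracy-style metrics looked so odd. On the real data, it would be worth comparing model scores on df and df_known before deciding to drop anything.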
Great job though. Keep up the good work!