gimseng / 99-ML-Learning-Projects

A list of 99 machine learning projects for anyone interested to learn from coding and building projects
MIT License

Health-Vehicle Insurance Cross Sell Prediction (using kNN classification) #120

Closed tanviagwl98 closed 3 years ago

tanviagwl98 commented 3 years ago

Reference Issues/PRs

Fixes #108. See also #111.

What does this implement/fix? Explain your changes.

I proposed an exercise for ML practice, and this is the solution to that problem statement. It uses the kNN classification algorithm. I have renamed the folder so that the repository author does not run into conflicts when merging.

Any other comments?

Feel free to make changes to improve the performance of the model.

AjayKhalsa commented 3 years ago

Hi, it looks like your project is about vehicle insurance, not health insurance. Please clarify that in your readme, as it is a little confusing, and rename the file accordingly. Otherwise, it looks good.

tanviagwl98 commented 3 years ago

@AjayKhalsa Thank you for checking this out. I have made the changes you asked for.

gimseng commented 3 years ago

@tanviagwl98 I am confused. Is this the same as #111 (apart from the name change) or not? If it's the same, why not update that PR? If it's not the same, could you remind me what the difference is?

tanviagwl98 commented 3 years ago

@gimseng Yes, it is the same. I changed the folder name in a new branch because I was not able to update it in the original branch. I am new to open source contribution, so it would be great if you could help me.

gimseng commented 3 years ago

@tanviagwl98 Thanks. Is it OK for us to close #111 then and focus on this one? If so, please add more details about the code and the project to this PR (maybe copy some from #111).

tanviagwl98 commented 3 years ago

@gimseng It can be closed, as I have made the changes in #120 to avoid conflicts.

gimseng commented 3 years ago

@tanviagwl98 Could you clarify whether this is health insurance or vehicle insurance data? I think the Kaggle link also has a very confusing description of the data. We should either write a clearer, more detailed description of the data or find better data.

tanviagwl98 commented 3 years ago

@gimseng It is vehicle insurance, predicted from the preferences of people who already have health insurance, which is why it is called cross-sell prediction.

gimseng commented 3 years ago

@tanviagwl98 Thanks! I see, so these clients are existing health insurance policyholders of this company, and they do not have vehicle insurance with the company. The goal is to predict whether they are interested in subscribing to the vehicle insurance policy?

I'll clean up a few things to make that clear. Then I will review the code, hopefully by the weekend. Thanks for the help!

gimseng commented 3 years ago

@tanviagwl98 I think the correlation matrix entry for vehicle age is problematic: as it stands, vehicle age = 2 for all rows, so the column is constant and carries no information. I think the code you wanted for the VehicleAge function is:

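def VehicleAge(age):
    # Map the raw vehicle-age category string to an ordinal code.
    if age == '< 1 Year':
        return 0
    elif age == '1-2 Year':
        return 1
    else:
        # Any other value falls into the oldest category.
        return 2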

Please update this if you agree, then I'll merge and do a bit of clean up with links and paths.
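
A quick way to confirm the mapping no longer collapses to a single value is sketched below. This is only an illustration: the sample strings in `raw_ages` are hypothetical (the `'> 2 Years'` label in particular is an assumption about how the dataset spells that category), and it uses only the standard library rather than the notebook's own data loading.

```python
from collections import Counter
from statistics import pvariance


def VehicleAge(age):
    # Same mapping as proposed above.
    if age == '< 1 Year':
        return 0
    elif age == '1-2 Year':
        return 1
    else:
        return 2


# Hypothetical sample of raw vehicle-age strings (not taken from the dataset).
raw_ages = ['< 1 Year', '1-2 Year', '> 2 Years', '1-2 Year', '< 1 Year']

encoded = [VehicleAge(a) for a in raw_ages]
counts = Counter(encoded)  # how many rows fall into each code

# A column stuck at one value has zero variance, so its Pearson
# correlation with any other column is undefined.
constant_column = [2] * len(raw_ages)

print(counts)
print("encoded column varies:", pvariance(encoded) > 0)
print("constant column varies:", pvariance(constant_column) > 0)
```

If every row still ends up with the same code, the string comparisons probably do not match the raw values exactly (extra spaces, different spelling), which would be worth checking before recomputing the correlation matrix.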

Thanks!

tanviagwl98 commented 3 years ago

@gimseng Thanks for spotting that error. I missed it and got stuck on it. I am making the change in my Kaggle notebook and will share the link.

gimseng commented 3 years ago

@tanviagwl98 Thanks!

Just a note on the readme: I'll revert it to what I wrote before. Since we treat each project as a collaborative effort, the readme shouldn't refer to a particular person's Kaggle notebook or work, unless that is the data source.

I'll merge. Thanks for contributing!

tanviagwl98 commented 3 years ago

@gimseng Thank you for letting me know, and thank you for giving me this opportunity to contribute. I hope to contribute again in the future.