First, I did data preprocessing, which included handling null values.
I filled the null values in the body and acidity columns with their median values and also removed unnecessary rows.
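A minimal sketch of the median fill, assuming the data is held in a pandas DataFrame. The toy values below are made up for illustration; only the column names `body` and `acidity` come from the description above.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real dataset (values are invented).
df = pd.DataFrame({
    "body": [3.0, np.nan, 5.0, 4.0],
    "acidity": [2.0, 3.0, np.nan, 3.0],
})

# Replace the missing values in each column with that column's median.
# The median is computed over the non-null entries only.
for col in ["body", "acidity"]:
    df[col] = df[col].fillna(df[col].median())

print(df)
```

The median is less sensitive to extreme values than the mean, which matters here because the outliers are only removed in a later step.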
Then I handled categorical features using label encoding.
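Label encoding can be sketched with scikit-learn's `LabelEncoder` as follows. The column name `type` and its values are hypothetical and do not come from the actual dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical column, used only for illustration.
df = pd.DataFrame({"type": ["red", "white", "red", "rose"]})

# LabelEncoder assigns integers to the sorted unique labels:
# "red" gets 0, "rose" gets 1 and "white" gets 2.
encoder = LabelEncoder()
df["type_encoded"] = encoder.fit_transform(df["type"])

print(df)
```

Note that the integer codes imply an ordering that may not exist in the data. For tree-based models this is usually harmless; for linear or distance-based models, one-hot encoding is a common alternative.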
Then I did data visualisation to better understand the data, its outliers, and unnecessary columns.
I created a correlation matrix to identify and remove unnecessary features.
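One common way to use a correlation matrix for feature removal is to drop one feature from every highly correlated pair. The sketch below assumes that approach. The 0.9 threshold, the column names and the synthetic data are all illustrative choices, not taken from this PR.

```python
import numpy as np
import pandas as pd

# Synthetic data: "a_copy" is almost a linear function of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "a_copy": a * 2 + rng.normal(scale=0.01, size=200),
    "b": rng.normal(size=200),
})

# Absolute pairwise Pearson correlations.
corr = df.corr().abs()

# Keep only the strict upper triangle so each pair is inspected once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop any column that is strongly correlated with an earlier column.
threshold = 0.9  # assumed cut-off
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
reduced = df.drop(columns=to_drop)

print(to_drop)
```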
Then I drew several plots to get an idea of the outliers.
Then I removed the outliers using the interquartile range (IQR) method.
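The IQR method keeps only rows whose value lies within 1.5 × IQR of the first and third quartiles. A minimal sketch on a single hypothetical column named `price` (the multiplier 1.5 is the conventional default and is an assumption about what was used here):

```python
import pandas as pd

# Hypothetical column with one obvious outlier (100).
df = pd.DataFrame({"price": [10, 12, 11, 13, 12, 11, 100]})

# Quartiles and interquartile range.
q1 = df["price"].quantile(0.25)
q3 = df["price"].quantile(0.75)
iqr = q3 - q1

# Conventional fences at 1.5 * IQR beyond the quartiles.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only the rows that fall inside the fences.
filtered = df[df["price"].between(lower, upper)]

print(filtered)
```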
Then I split my data into training and testing datasets and used feature scaling (standardisation) so that each feature has zero mean and unit variance, which puts most values roughly between -3 and 3 and helps the models perform well.
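A sketch of the split and scaling step using scikit-learn, on synthetic data. The 80/20 split and `random_state` are assumptions. The scaler is fitted on the training set only and then applied to the test set, so no information from the test data leaks into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features and binary labels, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(loc=50, scale=10, size=(200, 3))
y = rng.integers(0, 2, size=200)

# Hold out 20% of the rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Learn the mean and standard deviation from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

Standardisation does not strictly bound the values. For roughly normal data, about 99.7% of scaled values land between -3 and 3, but extreme points can still fall outside that range.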
Then I used GridSearchCV on multiple models to find the best accuracy and the corresponding best parameters.
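Running `GridSearchCV` over several models can be sketched as a loop over (estimator, parameter grid) pairs. The two models, their grids and the synthetic dataset below are illustrative assumptions; the models actually compared in this PR are not listed in the description.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hypothetical candidate models and parameter grids.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
}

# Run a 5-fold cross-validated grid search for each model and record
# its best mean accuracy together with the parameters that produced it.
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Pick the model with the highest cross-validated accuracy.
best_name = max(results, key=lambda n: results[n][0])
print(best_name, results[best_name])
```

In a real run the search would be fitted on the scaled training split only, leaving the test split untouched for the final evaluation.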
Afterwards, I predicted the output for my testing dataset and made a confusion matrix to check where our predictions and the real values differ and to evaluate the performance of the model.
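A small sketch of the evaluation step with scikit-learn's `confusion_matrix`. The label arrays are made up. In the real workflow `y_pred` would come from calling `predict` on the scaled test features.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up true labels and predictions for a binary problem.
y_test = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

# Rows are the true classes and columns are the predicted classes,
# so off-diagonal entries count the misclassifications.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)

print(cm)
print(acc)
```

Here the diagonal holds the four correct predictions and each off-diagonal cell holds one error.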
Good work. I appreciate the detailed PR description.
The e-notation values in the matrix could have been converted to integers. There is a parameter that needs to be set to True for this.