aidaarouri / titanic_data_set

MIT License

Data Cleaning #4

Open SupervisionT opened 1 year ago

SupervisionT commented 1 year ago

In this issue we'll discuss how to clean the data: what is needed and which actions to take. Please share your thoughts in the comments.

aidaarouri commented 12 months ago

Remove duplicate data and incomplete cases.

cu2021 commented 12 months ago
  1. Check for missing values.
  2. Find the best way to handle the missing values.
  3. Look for imbalanced attributes in the data.
  4. Handle those imbalanced attributes.
  5. Check for outliers in the data.
  6. Handle the outliers.
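
A rough sketch of how these six steps might look in pandas. The sample rows and the column names (Survived, Age, Fare) are made up for illustration and are not taken from the repository's data. Median filling and the 1.5 * IQR rule are only one common choice each, and step 4 is limited to measuring the imbalance, since how to correct it (resampling, class weights, and so on) depends on the model used later.

```python
import pandas as pd

# Tiny hypothetical sample with Titanic-style column names (not the real data).
df = pd.DataFrame({
    "Survived": [0, 1, 0, 0, 1, 0],
    "Age": [22.0, 38.0, None, 35.0, None, 54.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 8.46, 512.33],
})

# Steps 1-2: count missing values per column, then apply a strategy.
# Filling Age with the median is an assumption; dropping rows is another option.
missing_counts = df.isna().sum()
df["Age"] = df["Age"].fillna(df["Age"].median())

# Steps 3-4: measure how balanced the target column is (share of each class).
class_share = df["Survived"].value_counts(normalize=True)

# Steps 5-6: flag outliers in Fare with the 1.5 * IQR rule and filter them out.
q1, q3 = df["Fare"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["Fare"] < q1 - 1.5 * iqr) | (df["Fare"] > q3 + 1.5 * iqr)
df_no_outliers = df[~is_outlier]

print(missing_counts)
print(class_share)
print(df_no_outliers)
```

Whether to drop or keep the flagged rows is a judgment call: a very high fare may be a legitimate value and not an error.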
majdadel20 commented 12 months ago

First remove incomplete or missing data, then use pandas DataFrame.drop_duplicates().
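
A minimal sketch of that order of operations, assuming the data is already loaded into a DataFrame. The sample rows and column names here are invented for illustration.

```python
import pandas as pd

# Hypothetical sample: one row with a missing Age, one exact duplicate row.
df = pd.DataFrame({
    "Name": ["A", "B", "B", "C", "D"],
    "Age": [22.0, 38.0, 38.0, None, 35.0],
    "Fare": [7.25, 71.28, 71.28, 8.05, 8.05],
})

# Step 1: drop every row that has at least one missing value.
complete = df.dropna()

# Step 2: drop rows that are exact duplicates of an earlier row,
# then renumber the index so it runs 0..n-1 again.
cleaned = complete.drop_duplicates().reset_index(drop=True)

print(cleaned)
```

Both dropna() and drop_duplicates() return new DataFrames by default, so the original df stays unchanged.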