Omdena-NIC-Nepal / machine-learning-linear-regression-ai-dreamers

omdena-nic-nepal-classroom-1f2b87-machine_learning_linear_regression-Machine_Learning_Linear_Regress created by GitHub Classroom

Data Exploration for EDA.ipynb #6

Closed: urs-santoshh closed 1 month ago

urs-santoshh commented 1 month ago

Description:

We need to perform data exploration and preprocessing for the dataset in the notebooks/EDA.ipynb notebook. The main objectives include understanding the data structure, identifying relationships between features and the target variable, and handling missing values and outliers.

Tasks:

  1. Load the dataset:

    • Ensure the dataset is correctly loaded into the notebook.
  2. Explore the data structure:

    • Examine the types and structure of the data.
    • Generate summary statistics for numerical and categorical features.
  3. Visualize relationships:

    • Create visualizations to understand the relationships between different features and the target variable.
    • Use plots such as histograms, box plots, scatter plots, and correlation heatmaps.
  4. Identify missing values and outliers:

    • Detect and quantify missing values in the dataset.
    • Identify outliers and consider their impact on the analysis.
  5. Document findings:

    • Provide a summary of findings from the exploration phase.
    • Suggest potential next steps for data preprocessing based on the identified issues.
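
As a rough starting point for tasks 2 and 4 (and the feature/target relationships in task 3), the checks could be sketched with pandas as below. This is only a minimal sketch under assumptions: the issue does not name the dataset or its columns, so the toy DataFrame, the column names (`area`, `rooms`, `city`, `price`), and the helper functions `missing_value_report`, `iqr_outlier_mask`, and `target_correlations` are all hypothetical. The 1.5 × IQR rule is one common outlier convention, not necessarily the one the project will settle on.

```python
import numpy as np
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    counts = df.isna().sum()
    report = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    return report.sort_values("missing", ascending=False)


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def target_correlations(df: pd.DataFrame, target: str) -> pd.Series:
    """Pearson correlation of each numeric feature with the target."""
    corr = df.select_dtypes(include="number").corr()[target]
    return corr.drop(target).sort_values(key=abs, ascending=False)


# Hypothetical toy data standing in for the real dataset (assumption).
df = pd.DataFrame({
    "area": [50.0, 60.0, 70.0, 80.0, 90.0, 500.0],
    "rooms": [1, 2, 2, 3, np.nan, 4],
    "city": ["A", "B", "A", None, "B", "A"],
    "price": [100.0, 120.0, 140.0, 160.0, 180.0, 1000.0],
})

print(df.dtypes)                          # types and structure
print(df.describe(include="all"))         # numerical and categorical summary
print(missing_value_report(df))           # missing values per column
print(df[iqr_outlier_mask(df["area"])])   # rows flagged as outliers in "area"
print(target_correlations(df, "price"))   # feature vs. target correlation
```

In the notebook, the toy DataFrame would be replaced by the real dataset once it is loaded, and the outlier check would be repeated for each numeric column of interest. The plots from task 3 are left out of this sketch.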

Assignment:

Assign this issue to @sharad3595. The assignee is responsible for completing the tasks outlined above and for documenting the process and findings in the notebooks/EDA.ipynb notebook.

Deadline:

[Specify Deadline]

sharad3595 commented 1 month ago

I will be doing it, ok?