zairza-cetb / HackWithZairza

Welcome to the repository for HackWithZairza.
MIT License

#10 Netflix Movie Dataset Analysis and Recommendation System #29

Closed Mukku27 closed 1 month ago

Mukku27 commented 1 month ago

Pull Request: Netflix Movie Dataset Analysis and Recommendation System

Overview

This pull request addresses issue #10 by adding an analysis of the Netflix movie dataset, covering data visualization, insights, and a movie recommendation system. It meets the requirements outlined in the issue and adds exploratory data analysis (EDA) and machine learning components to the project.

Key Features

  1. Dataset Download:

    • Uses KaggleHub to download the Netflix Prize dataset.
  2. Exploratory Data Analysis (EDA):

    • Conducted a thorough EDA, including:
      • Visualization of rating distributions to understand user behavior.
      • Identification and handling of missing data.
      • Summary statistics for movies and customers, highlighting patterns in user ratings.
  3. Data Visualization:

    • Created visualizations using Matplotlib and Seaborn:
      • Bar charts and histograms showing rating distributions and the frequency of movie reviews.
      • Observations drawn from these visualizations about user engagement and movie popularity.
  4. Dataset Trimming:

    • Trimmed the dataset by removing movies with few reviews and customers who submitted few reviews, to improve the quality of subsequent analyses.
  5. Pivot Table Creation:

    • Built a pivot table of ratings so that correlations can be computed directly from it.
  6. Collaborative Filtering with SVD:

    • Implemented a Singular Value Decomposition (SVD) model for collaborative filtering to facilitate personalized movie recommendations.
    • Conducted cross-validation to evaluate model performance using RMSE and MAE metrics.
  7. Movie Recommendation Function:

    • Added a function that generates movie recommendations based on user input.
    • Uses Pearson correlation to find similar movies.
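
For reviewers who want a concrete picture of how the trimming, pivot table, and Pearson-correlation steps listed above fit together, here is a minimal, dependency-free sketch. It is not the code from this pull request: the toy ratings, the thresholds, and the helper names (`trim`, `pivot`, `pearson`, `similar_movies`) are all hypothetical, and plain dictionaries stand in for the pivot table.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy ratings: (customer_id, movie_title, rating on a 1-5 scale).
RATINGS = [
    (1, "A", 5), (1, "B", 4), (1, "C", 1),
    (2, "A", 4), (2, "B", 5), (2, "C", 2),
    (3, "A", 1), (3, "B", 2), (3, "C", 5),
    (4, "A", 2), (4, "B", 1), (4, "C", 4),
    (5, "D", 3),  # customer 5 and movie D each appear only once
]


def trim(ratings, min_movie_reviews, min_customer_reviews):
    """Drop rows whose movie or customer has too few ratings (single pass)."""
    movie_counts = Counter(movie for _, movie, _ in ratings)
    customer_counts = Counter(cust for cust, _, _ in ratings)
    return [
        (cust, movie, rating)
        for cust, movie, rating in ratings
        if movie_counts[movie] >= min_movie_reviews
        and customer_counts[cust] >= min_customer_reviews
    ]


def pivot(ratings):
    """Build {movie: {customer: rating}}, a dict-based stand-in for a pivot table."""
    table = defaultdict(dict)
    for cust, movie, rating in ratings:
        table[movie][cust] = rating
    return dict(table)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def similar_movies(table, title, top_n=3):
    """Rank other movies by correlation with `title` over shared customers."""
    target = table[title]
    scored = []
    for other, other_ratings in table.items():
        if other == title:
            continue
        shared = sorted(target.keys() & other_ratings.keys())
        score = pearson([target[c] for c in shared],
                        [other_ratings[c] for c in shared])
        if score is not None:
            scored.append((other, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


trimmed = trim(RATINGS, min_movie_reviews=2, min_customer_reviews=2)
table = pivot(trimmed)
recommendations = similar_movies(table, "A")
```

On this toy data, movie B is rated much like movie A and ranks first, while movie C is rated in the opposite pattern and ranks last. With pandas, the same idea is usually expressed with a pivot table and a column-wise correlation, which is presumably closer to what the notebook does.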

Usage

Visual Insights

Conclusion

This pull request delivers the analysis of the Netflix movie dataset requested in issue #10. It combines data exploration, visualizations, and a working recommendation system, and it lays the foundation for further enhancements such as user-based recommendations or additional features.


Please review the changes and provide feedback or suggestions for improvements. Thank you!

FAMFRIENDLYMONKE commented 1 month ago

meets guidelines.