carla-caracola / cinem_extract_phyton_SQL

In this project, we extract cinema-related data from multiple sources using an API and web scraping. The data is then transformed and loaded into a MySQL database that we designed. We implemented this using MySQL Workbench, Jupyter Notebooks, Python, and libraries like Beautiful Soup, Selenium, MySQL Connector, Pandas, and NumPy.

🎬 Cinem Extract 🎬

📝 Project statement

The streaming platform BHO is constantly looking to improve the quality of its content and the satisfaction of its users. Our project applies data analysis techniques to identify the most popular and best-rated movies and short films released from 2010 to date. This will help BHO make informed decisions about what content to promote and highlight on its platform.

⚙ Technologies

🛠 Installation

  1. Install Python:

    • Download and install Python from Python.org.
    • Make sure to check the option to add Python to your PATH during the installation.
  2. Install Jupyter Notebook:

    • Open a terminal or command prompt and run the following command to install Jupyter Notebook:
      pip install notebook
  3. Install Visual Studio Code:

  4. Install Visual Studio Code Extensions:

    • Open Visual Studio Code.
    • Go to the Extensions view by clicking on the square icon in the sidebar or pressing Ctrl+Shift+X.
    • Search for and install the following extensions:
      • Python (by Microsoft)
      • Jupyter (by Microsoft)
  5. Install MySQL and MySQL Workbench:

  6. Install the Cinem Extract database:

    • In MySQL Workbench, go to the "File" menu at the top and select "Run SQL Script".
    • Navigate to the location of the cinem_extract.sql file.
    • Select the file and click "Open".
    • Click "Run" to start the import process.
  7. Execute queries:

    • Open MySQL Workbench and connect to the database.
    • Open the queries_cinema_extract.sql file and run the SQL queries to get results.
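
The queries can also be run from Python with MySQL Connector, which the project already uses. The following is a minimal sketch, not the project's own code: `run_query` and `show_tables` are hypothetical helper names, and the host, user, password, and schema name are placeholders to replace with your own values.

```python
def run_query(connection, sql, params=None):
    """Run one statement on an open DB-API connection and return all rows."""
    cursor = connection.cursor()
    try:
        if params is None:
            cursor.execute(sql)
        else:
            cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        cursor.close()


def show_tables(host, user, password, database):
    """List the tables of a MySQL schema (all arguments are placeholders)."""
    # Imported here so run_query stays usable without the driver installed.
    import mysql.connector  # pip install mysql-connector-python

    connection = mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )
    try:
        return run_query(connection, "SHOW TABLES")
    finally:
        connection.close()
```

For example, `show_tables("localhost", "root", "your_password", "cinem_extract")` should return one tuple per table once the SQL script has been imported, assuming the schema it creates is named `cinem_extract`.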

📋 Phases

Phase 1: Data Extraction from MoviesDataset API

Make requests to the MoviesDataset API and extract relevant information about the movies. Specifically, you will have to extract the following information:
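
The field list is not reproduced here, so the sketch below only illustrates the general shape of this phase: download a JSON response, keep titles from 2010 onward, and retain a chosen set of fields. Everything specific in it is an assumption: `API_URL` is a placeholder, the field names `title`, `year`, and `type` are hypothetical, and the standard-library `urllib` stands in for whatever HTTP client the notebooks actually use.

```python
import json
from urllib.request import urlopen

# Placeholder endpoint: the real MoviesDataset API URL is not given in this README.
API_URL = "https://example.com/moviesdataset/api"

# Hypothetical field names, chosen only for illustration.
FIELDS = ("title", "year", "type")


def fetch_json(url, timeout=10):
    """Download a URL and decode its body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def extract_rows(payload, fields=FIELDS, min_year=2010):
    """Keep only the chosen fields of records released in min_year or later."""
    rows = []
    for record in payload:
        year = record.get("year")
        # Skip records with no year or a year before the cutoff.
        if year is None or int(year) < min_year:
            continue
        rows.append({field: record.get(field) for field in fields})
    return rows


if __name__ == "__main__":
    # Made-up sample records, so the sketch runs without network access.
    sample = [
        {"title": "Example A", "year": 2012, "type": "Movie", "extra": 1},
        {"title": "Example B", "year": 2005, "type": "Short"},
    ]
    print(extract_rows(sample))
```

With a real endpoint, `extract_rows(fetch_json(API_URL))` would produce a list of dictionaries that can be passed to `pandas.DataFrame` for the transformation step.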

🚀 About Women In Films

Women In Films is a fictitious company formed by three students from the Adalab Data Analysis Bootcamp, who worked together on the CinemExtract project for Module 2 of the Bootcamp. Thank you for reading, and we hope you find our project useful!

✒ Authors

🎁 Acknowledgements

To the Adalab professors for their attention throughout the project, and to our classmates for their support and for sharing.