dgPadBootcamps / Java-Bootcamp-2024


Task 2: Adding Dependencies and Web Scraping #86

Closed mohammad-fahs closed 2 weeks ago

mohammad-fahs commented 4 weeks ago

Task 2:

Objective:

In this task, you will initialize a new Spring Boot project, add the JSoup dependency, and write a simple Java program to scrape data from a website of your choice. The goal is to get hands-on experience with setting up a Spring Boot project, using an external library (JSoup), and applying web scraping techniques.

Instructions:

  1. Initialize a New Spring Boot Project:
    • Visit [Spring Initializr](https://start.spring.io/) and generate a new Spring Boot project with the following settings:
      • Project: Maven
      • Language: Java
      • Spring Boot Version: 3.x (the latest stable version)
      • Project Metadata:
        • Group: com.yourname
        • Artifact: web-scraper
        • Name: Web Scraper
        • Package Name: com.yourname.webscraper
      • Dependencies: Add the Spring Web dependency (to allow adding more features later).
    • Click on "Generate" to download the project as a ZIP file.
    • Unzip the downloaded file and open the project in IntelliJ IDEA.
  2. Add the JSoup Dependency:
    • Open the pom.xml file in the root directory of your project.
    • Add the JSoup dependency (groupId org.jsoup, artifactId jsoup, together with the version you want to use) within the <dependencies> tag.
    • Save the pom.xml file and allow IntelliJ to update the Maven project to download the JSoup library.
  3. Choose a Website for Scraping:
    • Select a website that you find interesting or relevant. It could be an e-commerce site, a news website, a blog, or any other public web page with data you'd like to extract.
    • Identify the specific data you want to scrape from the website (e.g., product names, prices, article titles, etc.).
  4. Implement a CommandLineRunner Class:
    • Create a new Java class in the com.yourname.webscraper package that implements the CommandLineRunner interface.
    • In the run method, use JSoup to connect to the website you chose and scrape the data.
    • Print the scraped data to the console.
  5. Run Your Application:
    • Run the Spring Boot application and observe the output in the console.
    • Ensure that the scraped data is displayed correctly.
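
Steps 4 and 5 can be sketched roughly as follows. This is only a minimal illustration, not a reference solution: the class name ScraperRunner, the target URL https://example.com/, the CSS selector "h1", the user-agent string, and the helper extractText are all placeholders chosen for the example, so substitute the site and selectors you picked in step 3. It assumes the JSoup dependency from step 2 is already on the classpath.

```java
package com.yourname.webscraper;

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

// Registered as a Spring bean, so Spring Boot calls run(...) once at startup.
@Component
public class ScraperRunner implements CommandLineRunner {

    // Placeholder values (assumptions): replace with your own site and selector.
    private static final String TARGET_URL = "https://example.com/";
    private static final String CSS_QUERY = "h1";

    @Override
    public void run(String... args) throws Exception {
        // Fetch and parse the page; get() throws IOException on network errors.
        Document page = Jsoup.connect(TARGET_URL)
                .userAgent("web-scraper-bootcamp/0.1") // hypothetical identifier
                .timeout(10_000)                       // milliseconds
                .get();

        // Print each scraped value on its own line.
        extractText(page, CSS_QUERY).forEach(System.out::println);
    }

    // Returns the text of every element matching the CSS query.
    // Kept separate from the network call so it can be tried on static HTML.
    public static List<String> extractText(Document page, String cssQuery) {
        return page.select(cssQuery).eachText();
    }
}
```

Keeping the selector logic in its own method makes it possible to check it against a saved HTML snippet with Jsoup.parse(...) before pointing it at a live site. Because the Spring Web dependency is included, the application will probably keep running after the output is printed, so stop it from the IDE once you have checked the console.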

Submit Your Work:

Resources that can help:

ZeinabHussieni commented 3 weeks ago

@mohammad-fahs

Task2.pdf