dgPadBootcamps / Java-Bootcamp-2024


Task 2 : Adding Dependencies and Web Scraping #84

Closed mohammad-fahs closed 2 months ago

mohammad-fahs commented 2 months ago

Task 2 :

Objective:

In this task, you will initialize a new Spring Boot project, add the JSoup dependency, and write a simple Java program to scrape data from a website of your choice. The goal is to get hands-on experience with setting up a Spring Boot project, using an external library (JSoup), and applying web scraping techniques.

Instructions:

  1. Initialize a New Spring Boot Project:
    • Visit [Spring Initializr](https://start.spring.io/) and generate a new Spring Boot project with the following settings:
      • Project: Maven
      • Language: Java
      • Spring Boot Version: 3.x (the latest stable version)
      • Project Metadata:
        • Group: com.yourname
        • Artifact: web-scraper
        • Name: Web Scraper
        • Package Name: com.yourname.webscraper
      • Dependencies: Add the Spring Web dependency (to allow adding more features later).
    • Click on "Generate" to download the project as a ZIP file.
    • Unzip the downloaded file and open the project in IntelliJ IDEA.
  2. Add the JSoup Dependency:
    • Open the pom.xml file in the root directory of your project.
    • Add the JSoup dependency (Maven coordinates org.jsoup:jsoup) inside the <dependencies> tag.
    • Save the pom.xml file and allow IntelliJ to update the Maven project to download the JSoup library.
  3. Choose a Website for Scraping:
    • Select a website that you find interesting or relevant. It could be an e-commerce site, a news website, a blog, or any other public web page with data you'd like to extract.
    • Identify the specific data you want to scrape from the website (e.g., product names, prices, article titles, etc.).
  4. Implement a CommandLineRunner Class:
    • Create a new Java class in the com.yourname.webscraper package that implements the CommandLineRunner interface.
    • In the run method, use JSoup to connect to the website you chose and scrape the data.
    • Print the scraped data to the console.
  5. Run Your Application:
    • Run the Spring Boot application and observe the output in the console.
    • Ensure that the scraped data is displayed correctly.
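
Steps 2 and 4 above can be sketched as follows. This is a minimal illustration, not a reference solution. The version number, the class name ScraperRunner, the target URL, and the CSS selector are all assumptions chosen for the example. The dependency entry for pom.xml might look like this:

```xml
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <!-- Assumed version; check Maven Central for the current release. -->
    <version>1.17.2</version>
</dependency>
```

A runner class along these lines would then fetch a page and print the selected elements. It is not standalone: it compiles only inside the generated Spring Boot project, with the jsoup dependency on the classpath.

```java
package com.yourname.webscraper;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

// Spring Boot calls run() on every CommandLineRunner bean once the application has started.
@Component
public class ScraperRunner implements CommandLineRunner {

    // Hypothetical target page and CSS selector; replace both with values for your chosen site.
    private static final String URL = "https://example.com/";
    private static final String SELECTOR = "h1";

    @Override
    public void run(String... args) throws Exception {
        // Fetch and parse the page; the timeout (in milliseconds) avoids hanging on a slow server.
        Document page = Jsoup.connect(URL)
                .userAgent("Mozilla/5.0 (bootcamp web-scraper exercise)")
                .timeout(10_000)
                .get();

        System.out.println("Page title: " + page.title());

        // Print the text of every element matched by the selector.
        for (Element element : page.select(SELECTOR)) {
            System.out.println(element.text());
        }
    }
}
```

To find a selector for your own site, inspect the page in the browser's developer tools and pick the tag or class that wraps the data you want (for example, a class on each product name or article title).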

Submit Your Work:

Resources that can help:

HaneenHammoud commented 2 months ago

The website I chose is about soup recipes: it lists their names, their ingredients, and how to make them. It is all about how to make soups, and to some extent salads, for interested visitors. Screenshot (121)

web-scrapper.zip

mohammad-fahs commented 2 months ago

@haneendbouk can I please see a sample of the scraped output?

HaneenHammoud commented 2 months ago

What do you mean by scraped output?

On Sun, 18 Aug 2024, 13:54 Mohamad fahs wrote:

@haneendbouk https://github.com/haneendbouk can i please see a sample of the scraped output ?


mohammad-fahs commented 2 months ago

@haneendbouk OK, thank you, it appeared.