Task 2:
Objective:
In this task, you will initialize a new Spring Boot project, add the JSoup dependency, and write a simple Java program to scrape data from a website of your choice. The goal is to get hands-on experience with setting up a Spring Boot project, using an external library (JSoup), and applying web scraping techniques.
Instructions:
Spring Boot Version: 3.x (the latest stable version)
Project Metadata:
Group: com.yourname
Artifact: web-scraper
Name: Web Scraper
Package Name: com.yourname.webscraper
Dependencies: Add the Spring Web dependency (to allow adding more features later).
Click on "Generate" to download the project as a ZIP file.
Unzip the downloaded file and open the project in IntelliJ IDEA.
Add the JSoup Dependency:
Open the pom.xml file in the root directory of your project.
Add the following JSoup dependency within the <dependencies> tag:
Save the pom.xml file and allow IntelliJ to update the Maven project to download the JSoup library.
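The dependency entry referred to in the step above is not shown here, so the following is a minimal sketch of what it typically looks like. The `org.jsoup:jsoup` coordinates are the library's published Maven coordinates; the version number is an assumption, so check Maven Central for the current release before copying it.

```xml
<!-- JSoup HTML parser; place inside the existing <dependencies> element of pom.xml -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <!-- Assumed version; replace with the latest stable release -->
    <version>1.17.2</version>
</dependency>
```

Unlike the Spring starters, JSoup is not managed by the Spring Boot parent POM, so the `<version>` element needs to be stated explicitly.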
Choose a Website for Scraping:
Select a website that you find interesting or relevant. It could be an e-commerce site, a news website, a blog, or any other public web page with data you'd like to extract.
Identify the specific data you want to scrape from the website (e.g., product names, prices, article titles, etc.).
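Identifying the data usually comes down to finding CSS selectors that match the elements you care about. As a rough sketch, you can try selectors against a small HTML snippet before pointing JSoup at the live site. The markup, class names, and the `SelectorCheck` class below are hypothetical and only stand in for whatever structure your chosen page actually uses.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorCheck {

    // Hypothetical markup imitating a product listing; real pages will differ.
    public static final String SAMPLE_HTML =
            "<div class=\"product\"><span class=\"name\">Kettle</span>"
          + "<span class=\"price\">$24.99</span></div>"
          + "<div class=\"product\"><span class=\"name\">Toaster</span>"
          + "<span class=\"price\">$31.50</span></div>";

    // Parses the HTML and returns one "name - price" line per product block.
    public static List<String> describeProducts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element product : doc.select("div.product")) {
            String name = product.select("span.name").text();
            String price = product.select("span.price").text();
            lines.add(name + " - " + price);
        }
        return lines;
    }

    public static void main(String[] args) {
        describeProducts(SAMPLE_HTML).forEach(System.out::println);
    }
}
```

Your browser's developer tools ("Inspect element") are a convenient way to find the tag and class names to use in `select(...)`.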
Implement a CommandLineRunner Class:
Create a new Java class in the com.yourname.webscraper package that implements the CommandLineRunner interface.
In the run method, use JSoup to connect to the website you chose and scrape the data.
Print the scraped data to the console.
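The steps above could be sketched roughly as follows. This is one possible shape, not a required solution: the class name `ScraperRunner`, the target URL, the CSS selector, and the user-agent string are all placeholders chosen for illustration, and the extraction logic is split into a small helper only so that it can be tried on its own.

```java
package com.yourname.webscraper;

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

// Registered as a Spring bean, so Spring Boot calls run(...) once at startup.
@Component
public class ScraperRunner implements CommandLineRunner {

    // Placeholder values; replace with the site and selector you chose.
    private static final String TARGET_URL = "https://example.com";
    private static final String CSS_QUERY = "h1";

    @Override
    public void run(String... args) throws Exception {
        // Fetch and parse the page; get() throws IOException on network errors.
        Document doc = Jsoup.connect(TARGET_URL)
                .userAgent("Mozilla/5.0 (compatible; WebScraperDemo/1.0)")
                .timeout(10_000) // milliseconds
                .get();

        System.out.println("Page title: " + doc.title());
        for (String text : extractText(doc, CSS_QUERY)) {
            System.out.println(text);
        }
    }

    // Returns the text of every element matching the CSS query, in document order.
    public static List<String> extractText(Document doc, String cssQuery) {
        return doc.select(cssQuery).eachText();
    }
}
```

Because the project includes Spring Web, the embedded server keeps running after `run` finishes; the scraped output appears in the console during startup. Before scraping a real site, check its terms of use and `robots.txt`.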
Run Your Application:
Run the Spring Boot application and observe the output in the console.
Ensure that the scraped data is displayed correctly.
Submit Your Work:
Once you’ve completed the task, submit the following:
A brief description of the website you chose and what data you scraped.
The Java code you wrote for the CommandLineRunner.
A screenshot of the console output showing the scraped data.
Resources that can help: