DE-2410-A / web-scraping-anthony-harry

de-2410-a-challenges-web-scraping-web-scraping-activity created by GitHub Classroom

As a user, I want to see the number of books per category so that I can know which categories are good markets to enter #1

Open hdavidson42 opened 1 week ago

hdavidson42 commented 1 week ago

New Functionality: Scrape data from URL

Problem Definition

As a software developer, I need a function that can scrape data from a given URL and handle the response. This function will be a fundamental part of how we collect book category data, allowing us to fetch book information for a specific category.

Functional Requirements

  1. [ ] The function should accept a URL as an input parameter.
  2. [ ] The function should send an HTTP GET request to the provided URL using the requests library.
  3. The function should handle the response:
    • [ ] If the response status code is 200, the function should return the HTML content of the page.
    • [ ] If the response status code is not 200, the function should return an error message indicating the status code and that the request was unsuccessful.
  4. [ ] The function should catch any exceptions that occur during the request and return an appropriate error message.
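A minimal sketch of how these requirements might be met. The function name `fetch_html`, the `timeout` parameter, and the exact wording of the error messages are assumptions for illustration; the issue does not specify them.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page HTML on a 200 response, otherwise an error message string.

    Hypothetical helper: the name, the timeout default, and the message
    wording are assumptions, not part of the issue.
    """
    try:
        # Requirement 2: send an HTTP GET request with the requests library.
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException as exc:
        # Requirement 4: report request failures (connection errors,
        # timeouts, malformed URLs) as a message instead of raising.
        return f"An exception occurred during the request: {exc}"

    # Requirement 3: return HTML on 200, an error message otherwise.
    if response.status_code == 200:
        return response.text
    return f"Request unsuccessful. Status code: {response.status_code}"
```

Catching `requests.exceptions.RequestException` covers the errors raised by `requests` itself. If "any exceptions" is meant literally, a broader `except Exception` would be needed, at the cost of hiding unrelated bugs.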

Testing Requirements

  1. Successful Request:
    • [ ] The function should receive a URL and make a GET request to that URL.
    • [ ] The function should return the HTML content if the response status code is 200.
  2. Unsuccessful Request:
    • [ ] The function should return an error message if the response status code is not 200.
    • [ ] The error message should include the status code and indicate that the request was unsuccessful.
  3. Exception Handling:
    • [ ] The function should catch any exceptions that occur during the request.
    • [ ] The function should return an error message indicating that an exception occurred.

Definition of Done

Squire-A commented 1 week ago

New Functionality: Fetch HTML Data from URL

Problem Definition

As a data engineer, I need a function that can send an HTTP GET request to a URL and handle the response. This function will be a fundamental part of our book data pipeline.

Functional Requirements

  1. The function should accept a URL as an input parameter.
  2. The function should send an HTTP GET request to the URL using the requests library.
  3. The function should handle the response:
    • If the response status code is 200, the function should return the raw HTML data.
    • If the response status code is not 200, the function should return an error message indicating the status code and that the request was unsuccessful.
  4. The function should catch any exceptions that occur during the request and return an appropriate error message.

Testing Requirements

  1. Successful Request:
    • The function should receive a URL and make a GET request.
    • The function should return the HTML data if the response status code is 200.
  2. Unsuccessful Request:
    • The function should return an error message if the response status code is not 200.
    • The error message should include the status code and indicate that the request was unsuccessful.
  3. Exception Handling:
    • The function should catch any exceptions that occur during the request.
    • The function should return an error message indicating that an exception occurred.
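The three test scenarios above could be sketched with `unittest` and `unittest.mock`, so that no real network request is made. The function under test, `fetch_html`, is a hypothetical stand-in included only to keep the example self-contained; its name, the example URL, and the message wording are assumptions.

```python
import unittest
from unittest.mock import Mock, patch

import requests


def fetch_html(url):
    """Hypothetical function under test (name and messages are assumptions)."""
    try:
        response = requests.get(url, timeout=10)
    except requests.exceptions.RequestException as exc:
        return f"An exception occurred during the request: {exc}"
    if response.status_code == 200:
        return response.text
    return f"Request unsuccessful. Status code: {response.status_code}"


class FetchHtmlTests(unittest.TestCase):
    URL = "https://example.com/catalogue"  # placeholder URL, never contacted

    @patch("requests.get")
    def test_successful_request_returns_html(self, mock_get):
        # Simulate a 200 response carrying some HTML.
        mock_get.return_value = Mock(status_code=200, text="<html>books</html>")
        self.assertEqual(fetch_html(self.URL), "<html>books</html>")
        mock_get.assert_called_once_with(self.URL, timeout=10)

    @patch("requests.get")
    def test_unsuccessful_request_reports_status_code(self, mock_get):
        # Simulate a non-200 response.
        mock_get.return_value = Mock(status_code=404, text="Not Found")
        message = fetch_html(self.URL)
        self.assertIn("404", message)
        self.assertIn("unsuccessful", message)

    @patch("requests.get")
    def test_exception_returns_error_message(self, mock_get):
        # Simulate a failure raised while the request is being made.
        mock_get.side_effect = requests.exceptions.ConnectionError("connection refused")
        message = fetch_html(self.URL)
        self.assertIn("exception occurred", message)
```

Saved in a test module, these can be run with `python -m unittest`. Patching `requests.get` keeps the tests fast and independent of whether the target site is reachable.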

Definition of Done