PricePinion / PricePinion-Backend

This repo is the backend for PricePinion.
https://pricepinion-backend.azurewebsites.net/

Changed project structure. Created our first web scraper for Fred Meyer. #1

Closed maslindc2 closed 5 months ago

maslindc2 commented 5 months ago

Changed Project Structure

I have changed the call structure of the project. In the future this can become our Express server for PricePinion, or we can keep it as an entirely separate system. We can discuss this later.

Fred Meyer Web Scraper

We can now scrape products from Fred Meyer URLs, as long as the page lists all of the products for a particular department. Currently we are scraping Baby products and Fresh Vegetables. One thing to note: Fred Meyer displays a count of how many products are in each department, and for Fresh Vegetables this count is wrong. The site says 484 products are available but only shows 264.

Currently the results for each URL are stored in their own array. Once scraping finishes, the program prints the length of each URL's result array. A full run may take up to 2 minutes to scrape both department pages.
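For reviewers who want a picture of that flow, here is a minimal sketch of the per-URL collection step. It is not the code in this PR: `scrapeAll`, `scrapeDepartment`, and the department keys are hypothetical names, and the scraping function is passed in so the sketch stays independent of Puppeteer.

```javascript
// Hypothetical sketch, not the PR's code: scrapeAll and scrapeDepartment
// are illustrative names chosen for this example.
async function scrapeAll(departmentUrls, scrapeDepartment) {
  const results = {};

  // Scrape one department at a time so each URL gets its own result array.
  for (const [name, url] of Object.entries(departmentUrls)) {
    results[name] = await scrapeDepartment(url);
  }

  // After all scraping finishes, print the length of each result array.
  for (const [name, products] of Object.entries(results)) {
    console.log(`${name}: ${products.length} products`);
  }

  return results;
}
```

Here `scrapeDepartment` is assumed to be an async function that takes a department URL and resolves to an array of product objects. Running the departments sequentially mirrors the "finish, then print lengths" behavior described above. Running them in parallel could be faster, but might make the page-load timing harder to reason about.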

What's Left to Fix

There is still a timing issue I have to figure out with how Fred Meyer loads products. It's not a problem for Vegetables, but for the Baby food the final page takes an extra second to load. As a workaround I have the delay set to 5 seconds, so when you run the code you may see Puppeteer sit idle for 5 seconds before closing.
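One possible alternative to the fixed 5-second delay is to poll the number of loaded products and stop once it has stopped growing. This is a different technique from the one in this PR, offered only as a sketch: `waitUntilStable`, its option names, and its default values are all assumptions and would need tuning against the real pages.

```javascript
// Hypothetical helper, not part of this PR. Resolves once getCount() has
// returned the same value for `stableChecks` consecutive polls, or once
// `timeoutMs` has elapsed, whichever comes first.
async function waitUntilStable(
  getCount,
  { intervalMs = 500, stableChecks = 3, timeoutMs = 5000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  let last = await getCount();
  let stable = 0;

  while (stable < stableChecks && Date.now() < deadline) {
    // Pause between polls so the page has time to load more products.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    const current = await getCount();
    // Any change in the count resets the stability counter.
    stable = current === last ? stable + 1 : 0;
    last = current;
  }

  return last;
}
```

With Puppeteer, `getCount` could be something like `() => page.$$eval(selector, (els) => els.length)`, where `selector` is whatever matches a product card on the Fred Meyer page (not specified here). The 5-second timeout keeps the worst case the same as the current fixed delay, while pages that finish loading early would return sooner.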