BelgianNoise / colruyt-products-scraper

An application written in Go that scrapes Colruyt's API to retrieve all product listings.
https://colruyt-prijzen.nasaj.be/

Puppeteer for X-CG-APIKey #12

Closed: Merilairon closed this issue 1 month ago

Merilairon commented 2 months ago

Hey,

I have been running this scraper for a while now on my own server for Zottegem. I was just wondering why you are using Puppeteer to get the X-CG-APIKey, since it is also available at the following endpoint:

https://www.colruyt.be/content/clp/nl.model.json

It feels like a lot of the Puppeteer overhead could be avoided by using this endpoint to get the API key, unless I am missing something else ofc...

BelgianNoise commented 2 months ago

I honestly just did whatever came to mind at the time. I didn't know it was in that JSON file :) I might change it later, because I don't think Puppeteer was running correctly in GitHub Actions. I just set that key via the env var, which was a quick patch lol.

BelgianNoise commented 2 months ago

@Merilairon do you run the scraper with proxies on GitHub Actions? If so, which ones?

Merilairon commented 2 months ago

I've moved the whole setup to Docker, as it was running strangely on GitHub Actions. I'm using packet-stream right now. No issues so far.

BelgianNoise commented 1 month ago

> I have been running this scraper for a while now on my own server for Zottegem.

Are you also running the frontend? If so, any chance I could get access to it for a bit? I would like to see if there are some differences in promotions from time to time. (+ I have family members that live around there)

> I'm using packet-stream right now.

Any specific reason why you went with packet-stream?