EricJMarti / inventory-hunter

⚡️ Get notified as soon as your next CPU, GPU, or game console is in stock
MIT License
1.12k stars, 264 forks

Does the Amazon scraper work for anybody? #184

Open dinamoedm opened 3 years ago

dinamoedm commented 3 years ago

It looks like the bot doesn't load Amazon pages, so it doesn't scrape any info.

Here is the log:

starting: python /src/worker/lean_and_mean.py
I2021-03-24 02:04:44,039 [root] starting v1.6.6 with args: /src/run.py --alerter discord --webhook
I2021-03-24 02:04:44,285 [mzn_c_1] scraper initialized for https://www.amazon.ca/Intel-i7-10700K-Processor-Unlocked-BX8070110700K/dp/B086ML4XSB/ref=sr_1_1?dchild=1&keywords=10700k&qid=1616550790&s=electronics&sr=1-1
I2021-03-24 02:04:44,285 [mzn_c_2] scraper initialized for https://www.amazon.ca/dp/B086ML4XSB/aod
W2021-03-24 02:04:49,974 [mzn_c_1] missing title: https://www.amazon.ca/Intel-i7-10700K-Processor-Unlocked-BX8070110700K/dp/B086ML4XSB/ref=sr_1_1?dchild=1&keywords=10700k&qid=1616550790&s=electronics&sr=1-1
I2021-03-24 02:04:49,976 [mzn_c_1] not in stock
W2021-03-24 02:04:55,202 [mzn_c_2] missing title: https://www.amazon.ca/dp/B086ML4XSB/aod
I2021-03-24 02:04:55,203 [mzn_c_2] not in stock
W2021-03-24 02:05:00,881 [mzn_c_1] missing title: https://www.amazon.ca/Intel-i7-10700K-Processor-Unlocked-BX8070110700K/dp/B086ML4XSB/ref=sr_1_1?dchild=1&keywords=10700k&qid=1616550790&s=electronics&sr=1-1
I2021-03-24 02:05:00,882 [mzn_c_1] not in stock
W2021-03-24 02:05:06,007 [mzn_c_2] missing title: https://www.amazon.ca/dp/B086ML4XSB/aod
I2021-03-24 02:05:06,008 [mzn_c_2] not in stock
W2021-03-24 02:05:12,021 [mzn_c_1] missing title: https://www.amazon.ca/Intel-i7-10700K-Processor-Unlocked-BX8070110700K/dp/B086ML4XSB/ref=sr_1_1?dchild=1&keywords=10700k&qid=1616550790&s=electronics&sr=1-1
I2021-03-24 02:05:12,022 [mzn_c_1] not in stock
W2021-03-24 02:05:17,125 [mzn_c_2] missing title: https://www.amazon.ca/dp/B086ML4XSB/aod
I2021-03-24 02:05:17,126 [mzn_c_2] not in stock
W2021-03-24 02:05:22,770 [mzn_c_1] missing title: https://www.amazon.ca/Intel-i7-10700K-Processor-Unlocked-BX8070110700K/dp/B086ML4XSB/ref=sr_1_1?dchild=1&keywords=10700k&qid=1616550790&s=electronics&sr=1-1
I2021-03-24 02:05:22,771 [mzn_c_1] not in stock
PS C:\Dev\inventory-hunter>

ch0chi commented 3 years ago

I was getting this error too. I double-checked that the URLs were correct and that the parse params matched the elements on Amazon. I then checked data/{amazon folder} to see whether any of the scraper-generated HTML files could tell me anything. Unfortunately, it seems Amazon assumes you are a bot and requires a captcha to be entered before continuing to the product page.

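For anyone who wants to check their own saved pages the same way, here is a minimal sketch that scans a folder of HTML files for signs of a captcha page. It is not part of inventory-hunter. The marker strings are assumptions based on what Amazon's robot-check page has contained and may change at any time, and `looks_like_captcha` and `find_captcha_pages` are hypothetical helper names.

```python
from __future__ import annotations

from pathlib import Path

# Assumed markers: strings that have appeared on Amazon's robot-check page.
# They are not taken from inventory-hunter and may stop matching at any time.
CAPTCHA_MARKERS = (
    "/errors/validateCaptcha",
    "Enter the characters you see below",
    "api-services-support@amazon.com",
)


def looks_like_captcha(html: str) -> bool:
    """Return True if the HTML contains any of the assumed captcha markers."""
    lowered = html.lower()
    return any(marker.lower() in lowered for marker in CAPTCHA_MARKERS)


def find_captcha_pages(data_dir: str) -> list[Path]:
    """List the .html files under data_dir that look like captcha pages."""
    return sorted(
        path
        for path in Path(data_dir).rglob("*.html")
        if looks_like_captcha(path.read_text(encoding="utf-8", errors="replace"))
    )
```

Pointing `find_captcha_pages` at the folder where the scraper saves its HTML should show which fetches were answered with a captcha instead of a product page, assuming the markers still match.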

xBaronn commented 3 years ago

So there is no workaround, huh?

Crazyhead90 commented 3 years ago

It seems that it only occasionally finds a web page. Sometimes a scraper doesn't report "missing title", and I'm not sure why. In the list below, number 16 is found but all the others are not:

W2021-05-07 13:01:03,473 [mzn_nl_14] missing title: https://www.amazon.nl/EVGA-GeForce-10G-P5-3897-KR-Technologie-Backplate/dp/B08HGYXP4C/
I2021-05-07 13:01:03,474 [mzn_nl_14] not in stock
W2021-05-07 13:01:04,680 [mzn_nl_15] missing title: https://www.amazon.nl/EVGA-GeForce-GAMING-10G-P5-3881-KR-koeling/dp/B08HH1BMQQ/
I2021-05-07 13:01:04,681 [mzn_nl_15] not in stock
I2021-05-07 13:01:06,610 [mzn_nl_16] now in stock at 2049.0... too expensive
W2021-05-07 13:01:06,810 [mzn_nl_17] missing title: https://www.amazon.nl/GIGABYTE-WATERFORCE-koelsysteem-GV-N3080AORUSX-videokaart/dp/B083GSKZSW/
I2021-05-07 13:01:06,811 [mzn_nl_17] not in stock
W2021-05-07 13:01:07,920 [mzn_nl_18] missing title: https://www.amazon.nl/GIGABYTE-WATERFORCE-koelsysteem-GV-N3090AORUSX-videokaart/dp/B08R5J94WP/
I2021-05-07 13:01:07,921 [mzn_nl_18] not in stock
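To see at a glance which scrapers got through, a log like the one above can be tallied per scraper. This is only a rough sketch: the entry format (level letter, timestamp, scraper name in brackets, message) is inferred from the excerpts in this thread, `tally_events` and `scrapers_that_loaded` are hypothetical names, and treating "no missing title warning" as "the product page was parsed" is an assumption.

```python
import re
from collections import Counter, defaultdict

# Entry format inferred from the log excerpts in this thread (an assumption):
# level letter, timestamp, [scraper name], then the message.
LOG_ENTRY = re.compile(
    r"[IWE]\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"\[(?P<scraper>[^\]]+)\] "
    r"(?P<event>missing title|not in stock|now in stock)"
)


def tally_events(log_text):
    """Count events per scraper; works whether or not entries are on separate lines."""
    tally = defaultdict(Counter)
    for match in LOG_ENTRY.finditer(log_text):
        tally[match["scraper"]][match["event"]] += 1
    return tally


def scrapers_that_loaded(tally):
    """Names of scrapers that never logged a 'missing title' warning."""
    return sorted(
        name for name, events in tally.items() if events["missing title"] == 0
    )


sample = (
    "W2021-05-07 13:01:04,680 [mzn_nl_15] missing title: "
    "https://www.amazon.nl/EVGA-GeForce-GAMING-10G-P5-3881-KR-koeling/dp/B08HH1BMQQ/ "
    "I2021-05-07 13:01:04,681 [mzn_nl_15] not in stock "
    "I2021-05-07 13:01:06,610 [mzn_nl_16] now in stock at 2049.0... too expensive"
)
print(scrapers_that_loaded(tally_events(sample)))  # prints ['mzn_nl_16']
```

Running it over a longer log would show whether the same scrapers succeed consistently or whether the captcha hits them at random, which might help narrow down the cause.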