benthetechguy closed this issue 2 years ago
I got around this by using the same request headers as your browser does. Try loading the page in your browser, complete any necessary CAPTCHAs, and use the "Network" panel to see the request headers. You can set headers in the `Scraper` constructor with the `headers` kwarg.
If you need more help let me know.
These are the correct headers to copy, right?
How do I put this into the Scraper? As a dict of the options? Would `pcpp = Scraper(headers={sec-ch-ua-mobile: "?0", sec-ch-ua-platform: "Linux", Upgrade-Insecure-Requests: 1, User-Agent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"})` be right? (I'm skipping the `sec-ch-ua` header because I don't even know where to start with entering that)
The one you want is "cookie". Don't post it here, it's confidential.
So `pcpp = Scraper(headers=x)`, x being whatever the value for cookie is?
```python
pcpp = Scraper(headers={
    "cookie": "the value of the header"
})
```
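Since the cookie is confidential, one option is to keep it out of the source code entirely. Below is a minimal sketch of that idea, assuming the cookie value is stored in an environment variable. The name `PCPP_COOKIE` and the helper `build_headers` are hypothetical and not part of the library. Note that header names and values both have to be quoted strings in the dict.

```python
import os


def build_headers(cookie=None):
    """Build the headers dict to pass to the Scraper constructor.

    PCPP_COOKIE is a hypothetical environment variable name chosen for
    this sketch; use whatever secret storage your bot host provides.
    """
    if cookie is None:
        cookie = os.environ.get("PCPP_COOKIE", "")
    if not cookie:
        raise ValueError("no cookie value provided and PCPP_COOKIE is not set")
    # Header names and values must both be strings.
    return {"cookie": cookie}


# Usage, following the snippet above:
#   pcpp = Scraper(headers=build_headers())
```

This way the cookie never ends up in the repository or in a pasted code sample.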
Perfect, thanks. Does this cookie ever expire? Does it only work for one list or does it allow access to all of PCPP?
I believe it expires every year? If you look at response headers then there's a `Max-Age` field in the `set-cookie` header:
31449600 seconds = basically a year
So you might need to update it every now and then but should be fine.
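To check the lifetime yourself, here is a small sketch that reads `Max-Age` out of a `set-cookie` header value using only the standard library. The sample header is made up for illustration (the cookie name and value are placeholders); only the `Max-Age` figure comes from this thread.

```python
from datetime import timedelta
from http.cookies import SimpleCookie

# Placeholder header value; the real cookie name and value will differ.
SAMPLE_SET_COOKIE = "example_cookie=abc123; Max-Age=31449600; Path=/"


def cookie_lifetimes(set_cookie_header):
    """Map each cookie name in the header to its Max-Age as a timedelta."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return {
        name: timedelta(seconds=int(morsel["max-age"]))
        for name, morsel in jar.items()
        if morsel["max-age"]  # empty string when Max-Age is absent
    }


lifetimes = cookie_lifetimes(SAMPLE_SET_COOKIE)
print(lifetimes["example_cookie"].days)  # 31449600 / 86400 = 364
```

So 31449600 seconds works out to exactly 364 days (52 weeks), which matches the "basically a year" estimate.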
Okay, perfect. I'll get him to test it, and if it works I'll close the issue. Thanks for the quick and helpful response!
No problem!
All good?
I'm sorry, I forgot all about this issue. It worked.
nice
I'm developing a Discord bot and used this library to post an embed with the details of a PCPP list whenever the link for one is posted by a user. Here's my code:
It works perfectly for me, here's the result:

The only problem is, the bot is hosted by my friend @Philipp-spec in Germany, and to view PCPartPicker lists he needs to go through a Cloudflare check first. As a result, the bot gives this error whenever it tries to scrape a page:
Is there any way to fix this? The whole point of the Cloudflare check is to make sure you're not a bot… Best way to reproduce is with a VPN, though sometimes it doesn't give you the check.