Closed by akocmark 8 years ago
We don't use the request bit of this library ourselves, only the methods once the cheerio object has been loaded, but we don't seem to have a problem with that site in citoid (https://github.com/wikimedia/citoid/blob/master/lib/Scraper.js). It might be because we use cookies in citoid?
Instead of the url as the first argument, you can also pass an options object like with the request library: https://github.com/request/request#requestoptions-callback
So in this options object you can put the url, a cookie jar, a user-agent string, etc. Some websites might flag or block the request if you make it without a user-agent string, so that could also be the issue.
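For reference, here is a rough sketch of what such an options object could look like. It only uses option names documented by the request library (`url`, `jar`, `headers`). The helper name `buildRequestOptions` and the user-agent string are made up for illustration, and I haven't verified which values that particular site accepts.

```javascript
// Hypothetical helper (not part of any library): builds an options object
// in the shape the request library accepts.
function buildRequestOptions(url, userAgent) {
  return {
    url: url,
    jar: true, // request option: remember cookies for subsequent requests
    headers: {
      // Placeholder value; use a string that identifies your own client.
      'User-Agent': userAgent
    }
  };
}

const options = buildRequestOptions(
  'http://gulfnews.com/news/uae/health/health-authority-launches-campaign-for-safe-disposal-of-expired-medicines-1.1637255',
  'my-scraper/1.0 (hypothetical user-agent)'
);

// Pass `options` as the first argument wherever you would otherwise
// pass the URL string.
```

If the site still rejects the request, try sending the user-agent without the cookie jar (or the other way round) to see which of the two it is checking.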
Hi mvolz!
Thank you for the quick response. The user-agent header did the trick! Thank you so much!
Hi guys, thank you for this wonderful module.
I just want to ask for help regarding this DDoS protection issue. It seems that this module can't get through some sites with DDoS protection (DDOS Arrest), like this Gulf News page: http://gulfnews.com/news/uae/health/health-authority-launches-campaign-for-safe-disposal-of-expired-medicines-1.1637255
Is there any way around this?
Thanks, Mark