Hyphe gets a random user agent for each crawl task from a web service.
For some websites, one might need to fix the user agent used by the crawler.
For instance, a website protected by Cloudflare requires a cookie that is only valid for the user agent used to generate it.
Therefore, for such websites, one needs to:
- visit the website in a web browser and solve the captcha, if any
- get the cookie that was created and the user agent of that web browser
- set both the cookie and the user agent in the crawl config panel of this web entity in Hyphe
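
To illustrate why the two values must travel together, here is a minimal sketch of a request that sends a cookie along with the user agent it was issued for. It uses only Python's standard library; `build_request`, the URL, and both header values are placeholders for illustration, not Hyphe code.

```python
from urllib.request import Request


def build_request(url, cookie, user_agent):
    """Build a request that sends the cookie with the user agent it was issued for."""
    headers = {"Cookie": cookie, "User-Agent": user_agent}
    return Request(url, headers=headers)


# Placeholder values: copy the real ones from the browser used to solve the captcha.
req = build_request(
    "https://example.com/",
    cookie="session=PLACEHOLDER",
    user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
)
```

If the crawler sent the same cookie with a different, randomly chosen user agent, the protected site would likely reject it and serve the challenge page again.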
So far, setting the cookie is possible, but not the user agent.
One enhancement would be to add a user agent parameter per crawl, the same way as for the cookie.
A user agent set at the crawl level would take precedence over the automatic random mechanism.
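
The precedence rule could look roughly like the sketch below. This is only an assumption about how it might be wired: `pick_user_agent`, the `user_agent` option key, and `FALLBACK_USER_AGENTS` (standing in for the web service that supplies random user agents) are hypothetical names, not existing Hyphe code.

```python
import random

# Hypothetical pool standing in for the web service that returns random user agents.
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def pick_user_agent(crawl_options):
    """Return the crawl-level user agent if one is set, else a random one."""
    # "user_agent" is an assumed option name, mirroring how a cookie option might be passed.
    fixed = (crawl_options.get("user_agent") or "").strip()
    if fixed:
        return fixed
    return random.choice(FALLBACK_USER_AGENTS)
```

An empty or whitespace-only value is treated as "not set", so existing crawls that do not define a user agent would keep the current random behaviour.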