Hi!
We're currently using courlan via trafilatura for some crawling. When running liveness checks on a host's URL, we get blocked because of the User-Agent header sent with the request, and there is currently no way for us to change it. I noticed that the redirection test used by is_live_page contains some commented-out code that references user agent headers.
Would there be any interest in supporting custom headers, or in setting a different default user agent?
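For illustration, this is roughly the behaviour we're after. It is only a standalone sketch using the Python standard library, not courlan's actual code; the function names `build_head_request` and `is_live_with_agent` and the user agent string are made up for the example.

```python
import urllib.request


def build_head_request(url, user_agent):
    """Build a HEAD request that carries a caller-supplied User-Agent."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )


def is_live_with_agent(url, user_agent, timeout=10):
    """Return True if the URL answers a HEAD request with a status below 400.

    Hypothetical helper: it only illustrates passing a custom header,
    it does not reproduce courlan's redirection test.
    """
    try:
        request = build_head_request(url, user_agent)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # OSError covers HTTP errors, connection failures and timeouts;
        # ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    print(is_live_with_agent("https://example.org", "my-crawler/1.0"))
```

Being able to pass something like the `user_agent` argument through to the liveness check would be enough for our use case.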
Thanks.