Open rmfkdehd opened 2 years ago
You sure that the site is up?
Also, are you sure that you aren't banned?
I can still open the site in regular Chrome, no problem at all.
weird. maybe the site requires JS and if you don't have it, bans you?
otherwise idk
@TheTechRobo please don't speculate like this in the issues, try to reproduce the issue yourself if you're interested in it.
Anyway, I see "DDoS protection by Cloudflare" (linking to https://www.cloudflare.com/5xx-error-landing/) in the resulting WARC when trying to crawl this forum.
Cloudflare is known to block bots that send the wrong TLS fingerprint. It is probably picking up on grab-site's 'incorrect' TLS fingerprint, which does not match that of the browser it claims to be (Firefox). We might be able to fix that in ludios_wpull.
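For anyone who wants to check their own crawl output for the same symptom, here is a minimal sketch of a heuristic that looks for the Cloudflare footer quoted above in a response body. The function name `looks_like_cloudflare_block` is hypothetical, and the marker text is only what appeared in this thread; Cloudflare may word its pages differently, so a negative result does not prove the crawl was not blocked.

```python
import re

# Marker taken from the block page quoted in this thread. The raw HTML wraps
# "Cloudflare" in a link, so an optional <a ...> tag is allowed in between.
# This is a heuristic, not an official Cloudflare signature.
_CF_MARKER = re.compile(
    rb"DDoS protection by\s*(?:<a\b[^>]*>)?\s*Cloudflare",
    re.IGNORECASE,
)


def looks_like_cloudflare_block(body: bytes) -> bool:
    """Return True if the response body contains the Cloudflare DDoS-protection footer."""
    return _CF_MARKER.search(body) is not None
```

The function takes the raw response bytes, so it can be applied to bodies pulled from a WARC with whatever reader you already use.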
@ivan Gotcha. :+1:
I installed grab-site on Ubuntu 20.04 using Nix.
The command I use is 'grab-site https://www.forexfactory.com/forums --concurrency=1'.
example.com and other sites crawled successfully, but 'https://www.forexfactory.com/' failed to crawl. I've also tried other URLs under that site.
Below is the log.