mfazliazran / skipfish

Automatically exported from code.google.com/p/skipfish
Apache License 2.0

How to prevent Limits Exceeded errors #134

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Hi lcamtuf,

How can I prevent errors like "20102 - Limits exceeded, fetch suppressed" (along 
with the message "Too many previous fetch failures")?

On some websites I see a very high number of these, and they seem to keep the 
crawler from covering every page, since pages that return this error do not 
appear to be indexed or crawled for additional links.

My only guess is that the website is responding too slowly. (These targets 
sometimes only manage 1-2 req/sec, even with -m 3 set, which I figured should 
reduce load on the webserver.)
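
For reference, a minimal sketch of the kind of invocation being described; the 
output directory and target URL are placeholders, and -m limits simultaneous 
connections per target IP:

    ./skipfish -m 3 -o output-dir http://example.com/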

Thanks!

Original issue reported on code.google.com by Charlie....@gmail.com on 12 Jan 2012 at 8:07

GoogleCodeExporter commented 8 years ago
Some of the tests we do involve comparing multiple server responses. If the 
server is not stable, these tests will start giving false positives. At the 
same time, the effort to suppress those false positives reduces coverage when 
testing against unstable servers.

You can try the following:

1) Recompile with a higher value for BH_CHECKS
2) Scan using an input file which contains the URLs you'd like to test (a rough 
sketch of both options follows below)
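
A rough sketch of both suggestions, assuming a skipfish source tree where 
BH_CHECKS is defined in config.h (the shipped default value and the exact 
invocation may differ between versions; the output directory and URLs are 
placeholders):

    # 1) Raise BH_CHECKS (the server-behavior check limit mentioned above) and
    #    rebuild; 30 is an arbitrary example value:
    sed -i 's/^#define BH_CHECKS .*/#define BH_CHECKS 30/' config.h
    make clean && make

    # 2) Seed the scan with the URLs you want covered, either directly on the
    #    command line as below or via an input file if your version supports one
    #    (check skipfish -h):
    ./skipfish -m 3 -o output-dir http://example.com/app/ http://example.com/login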

Btw, -m is not enough for throttling the requests per second. You could use 
"trickle" for this (see the known issues section), or wait for 2.04b, which 
should be out soon and will have a flag to limit the requests per second.
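
As a rough illustration, assuming trickle is installed: it shapes bandwidth in 
KB/s rather than limiting requests per second, so the rates below are only a 
crude stand-in for request throttling.

    # Run skipfish standalone under trickle, capping download and upload to
    # roughly 20 KB/s (the rates are illustrative only):
    trickle -s -d 20 -u 20 ./skipfish -m 3 -o output-dir http://example.com/

    # Releases from 2.04b onward are expected to offer a native requests-per-second
    # limit; check skipfish -h in your build for the exact flag.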

Niels

Original comment by niels.he...@gmail.com on 5 Feb 2012 at 2:41

GoogleCodeExporter commented 8 years ago
Not related, but one area for improvement: if the maximum number of failures is 
exceeded, the scanner refuses to schedule new requests but still processes the 
current queue (which may hold up to a few thousand requests). We should 
probably bail out sooner, because if requests have started timing out, draining 
that queue means waiting quite a while.

/mz

Original comment by lcam...@gmail.com on 5 Feb 2012 at 5:56

GoogleCodeExporter commented 8 years ago
Thanks for the comments, guys. I'll look forward to the new version, and in the 
meantime I'll give limiting reqs/s a shot to see if that corrects it.

If it's still an issue, I'll try doing a web crawl first and feeding the URLs 
in from a flat file, to avoid the coverage issue.

Thanks!

Original comment by Charlie....@gmail.com on 6 Feb 2012 at 6:49