JanPetterMG opened this issue 8 years ago
At the moment, it's possible to generate large (fake or valid) robots.txt files with the aim of trapping the robots.txt crawler, slowing down the server, or even causing it to hang or crash.
It's also possible (depending on the setup) to trap the crawler in an infinite retry loop if the external code using this library doesn't handle repeated fatal errors correctly...
Related to #62
Feature request: Limit the maximum number of bytes to parse.
Source: Google
Source: Yandex
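A minimal sketch of what such a byte cap could look like, written against Python's standard library rather than this project's code. The 500 KiB cap, the function names, and the example URL are illustrative assumptions (Google documents a 500 KiB robots.txt limit); the point is simply that the fetch step stops reading after a fixed number of bytes, so an oversized file can't exhaust memory or stall the parser.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

MAX_BYTES = 500 * 1024  # assumed cap; bytes beyond this are never read or parsed

def fetch_robots_txt(url: str, max_bytes: int = MAX_BYTES) -> str:
    """Download at most `max_bytes` of the robots.txt body."""
    with urlopen(url, timeout=10) as response:
        # read(n) returns at most n bytes, so the transfer stops here even if
        # the server keeps streaming an arbitrarily large (or fake) file.
        raw = response.read(max_bytes)
    return raw.decode("utf-8", errors="replace")

def parse_robots_txt(url: str) -> RobotFileParser:
    # Parse only the truncated content; rules past the cap are simply ignored.
    parser = RobotFileParser(url)
    parser.parse(fetch_robots_txt(url).splitlines())
    return parser

if __name__ == "__main__":
    rp = parse_robots_txt("https://example.com/robots.txt")  # placeholder URL
    print(rp.can_fetch("MyBot", "https://example.com/some/path"))
```

Truncating at a fixed limit mirrors what the major crawlers reportedly do: rules beyond the cap are dropped rather than triggering an error, which also avoids feeding the retry loop described above.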