Closed sbrun closed 3 years ago
Hi @sbrun - Thank you for taking the time to submit this issue. The HTTP 429 occurs because Google rightly identifies the script as a bot and is throttling searches from your IP, so the exception looks correct.
From https://bugs.kali.org/view.php?id=7005:
"It SHOULD deal with the 429 gracefully and back off the request rate a bit."
So are you requesting backoff logic? I've played around with some, but it's hard to know how long to wait "for the server to get out of its grumpy mood".
You're better off increasing the delay (through -e), at the cost of taking longer to run, or running the script through a bank of proxies. Another one of my tools, pagodo, encounters the same issues and that's basically what I recommend:
https://github.com/opsdisk/pagodo/blob/master/pagodo.py#L144
As for the metadata extraction, this was my stance on it: https://github.com/opsdisk/metagoofil#metadata-extraction
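For reference, the backoff idea discussed above could be sketched roughly as follows. This is not code from metagoofil or pagodo; the function name `backoff_delay` and the `base` and `cap` defaults are hypothetical, chosen only for illustration. It assumes the caller can read a `Retry-After` header value from the 429 response when one is present, and otherwise falls back to exponential backoff with jitter.

```python
import random


def backoff_delay(attempt, base=30.0, cap=600.0, retry_after=None):
    """Return how many seconds to wait before retry number `attempt` (0-based).

    `base` and `cap` are illustrative defaults, not values used by metagoofil.
    """
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds: honor it, up to the cap.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            # Retry-After may also be an HTTP date; this sketch does not parse
            # it and falls through to exponential backoff instead.
            pass
    # Double the ceiling on every attempt, but never exceed the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter: pick a random wait in the upper half of the window.
    return random.uniform(ceiling / 2, ceiling)
```

A caller would sleep for `backoff_delay(attempt)` seconds after each 429 and give up after a fixed number of attempts. The cap still leaves open the question of how long the server stays throttled, so a longer fixed delay or proxies may remain the more reliable option.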
Hi, I don't know what the best solution is, but I think it should not fail with a Python error. To the user it looks like a bug in the script, and they are left without any clue as to how to solve it. Maybe you can catch the error and add a comment, as you have done in pagodo?
As for the metadata extraction, it's not an issue from my point of view, as you clearly decided not to keep this feature in the tool.
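A minimal sketch of the "catch the error and add a comment" suggestion, assuming the request ultimately raises `urllib.error.HTTPError` as in the reported traceback. The function name `fetch_or_explain` and the injectable `opener` parameter are hypothetical and are not part of metagoofil; the `-e` delay option is the one mentioned earlier in this thread.

```python
import sys
import urllib.error
import urllib.request


def fetch_or_explain(url, opener=urllib.request.urlopen):
    """Open `url`; on HTTP 429 print a hint and return None instead of crashing.

    `opener` is injectable (hypothetical design choice) so the function can be
    exercised without network access.
    """
    try:
        return opener(url)
    except urllib.error.HTTPError as e:
        if e.code != 429:
            raise  # Only the throttling case is handled here.
        print(
            f"[!] HTTP 429 (Too Many Requests) for {url}. "
            "The search provider is throttling this IP. "
            "Try increasing the delay with -e or routing through proxies.",
            file=sys.stderr,
        )
        return None
```

With something like this, the user sees an actionable message rather than a traceback, while any other HTTP error still propagates unchanged.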
Hello, when you run this command (in Kali):
metagoofil -d https://sans.org -t doc,pdf,xls -l 200 -o sans_files -f
It fails instead of correctly handling this exception: urllib.error.HTTPError: HTTP Error 429: Too Many Requests
Issue was first reported here: https://bugs.kali.org/view.php?id=7005