tasos-py / Search-Engines-Scraper

Search google, bing, yahoo, and other search engines with python
MIT License
515 stars, 137 forks

Feature request: return status code (and error msg) #18

Closed: sershev closed this issue 3 years ago

sershev commented 3 years ago

First of all, many thanks for your work! Currently, if a search engine returns a non-200 status code, the only signal is a printed message. It would be great if the status code and error message were accessible, so that bans (or other issues) from search engines could be detected.

tasos-py commented 3 years ago

I'm not sure if we need this feature, but I can help you implement it if you want. How do you think we should handle this? Maybe create a SearchEngine.last_response attribute?

sershev commented 3 years ago

SearchEngine.last_response would be a good option. If I'm not wrong, engine.py already has the _is_ok function, which returns True/False and prints a message. Maybe it could be extended to return or store the status code; that alone would already be very helpful. I'm also using the MultipleSearchEngines class and, ideally, would like to know which of the engines banned me.
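To make the idea concrete, here is a minimal sketch of what storing the status code could look like. It is not the library's code: the class `StatusTrackingEngine` and the attributes `last_status` and `last_error` are hypothetical names, and the real `_is_ok` in engine.py may take different arguments.

```python
class StatusTrackingEngine:
    """Hypothetical stand-in for a search engine class (not the library's code)."""

    def __init__(self, name):
        self.name = name
        self.last_status = None  # assumed attribute: most recent HTTP status code
        self.last_error = None   # assumed attribute: message for a non-200 response

    def _is_ok(self, status_code, reason=""):
        """Record the status code, then return True only for HTTP 200."""
        self.last_status = status_code
        if status_code == 200:
            self.last_error = None
            return True
        # Keep the message instead of only printing it.
        self.last_error = "HTTP {} {}".format(status_code, reason).strip()
        return False


engine = StatusTrackingEngine("example")
ok = engine._is_ok(429, "Too Many Requests")
```

With something like this, the caller could check `engine.last_status` after a search that returned no results and decide whether the empty list was genuine or caused by a block.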

For me this feature is important because my search queries are very specific, so I often get empty results and I need to distinguish this from a ban.

I could change the code myself, but then I would most likely get merge conflicts when pulling new versions of the code from your repository later.

tasos-py commented 3 years ago

I added a SearchEngine.is_banned boolean attribute, with default value false that changes to true if the status code is 403, 429 or 503. Ban detection happens in SearchEngine._is_ok(), as you suggested; it makes good sense and it's easy to override. Also added a MultipleSearchEngines.banned_engines list that stores the names of banned engines. Thanks for your suggestions, looking forward to hearing your feedback.
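The behaviour described above can be sketched roughly as follows. This is a self-contained illustration under assumptions, not the repository's implementation: `BanAwareEngine`, `MultiEngineSketch`, and the `check` method are hypothetical names, and the status codes are fed in by hand instead of coming from real HTTP responses. Only the `is_banned` flag, the `banned_engines` list, and the codes 403, 429 and 503 come from the comment itself.

```python
BAN_STATUS_CODES = (403, 429, 503)  # the codes named in the comment above


class BanAwareEngine:
    """Hypothetical single-engine stand-in with an is_banned flag."""

    def __init__(self, name):
        self.name = name
        self.is_banned = False  # default value, as described

    def _is_ok(self, status_code):
        """Flip is_banned to True on a ban-like status; return True only for 200."""
        if status_code in BAN_STATUS_CODES:
            self.is_banned = True
        return status_code == 200


class MultiEngineSketch:
    """Hypothetical multi-engine stand-in that collects names of banned engines."""

    def __init__(self, engines):
        self.engines = engines
        self.banned_engines = []

    def check(self, status_by_engine):
        """Apply simulated status codes and record which engines look banned."""
        for engine in self.engines:
            engine._is_ok(status_by_engine[engine.name])
            if engine.is_banned and engine.name not in self.banned_engines:
                self.banned_engines.append(engine.name)
        return self.banned_engines


multi = MultiEngineSketch([BanAwareEngine("google"), BanAwareEngine("bing")])
banned = multi.check({"google": 429, "bing": 200})
```

Because the detection lives in `_is_ok`, a subclass could override that one method to treat other status codes as bans without touching the rest of the engine.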

sershev commented 3 years ago

Many thanks!