Closed by heya5 1 year ago
It happens sometimes. duckduckgo.com can block your IP for a few seconds if you send requests too often. In that case you will get None. Just repeat the request.
Got it. Thanks!
How would you distinguish an actual empty result from an empty result due to timing? Please add something like a notification/callback in case of HTTP 429.
It goes something like this:
[]
None
This package is configured to run in single-threaded sequential mode. Do not run it in multi-threaded mode, to avoid errors!
If the result is empty, it will return [].
If there is an error, it will return None.
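A minimal sketch of how a caller could act on that distinction, assuming the behaviour described above (None means an error or temporary block, [] means a genuinely empty result). The helper name search_with_retry and its parameters are hypothetical and not part of the package; search can be any callable, for example duckduckgo_search.ddg.

```python
import time


def search_with_retry(search, query, retries=3, delay=2.0, sleep=time.sleep):
    """Call search(query), retrying only while it returns None.

    Assumption (taken from the comment above): None signals an error or
    a temporary block, while [] is a real "no results" answer.
    """
    for attempt in range(retries):
        results = search(query)
        if results is not None:
            # A list, possibly empty: a genuine answer, so stop retrying.
            return results
        # Back off a little longer after each failed attempt.
        sleep(delay * (attempt + 1))
    # Still None after all retries: report the failure to the caller.
    return None
```

Usage would look like search_with_retry(duckduckgo_search.ddg, '"test test"'); a return value of None then means every attempt failed, while [] means the search succeeded and found nothing.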
Doesn't look like it: https://github.com/deedy5/duckduckgo_search/blob/a68b64a00b45e4ac0f955495f370ff9560a98693/duckduckgo_search/ddg.py#L47-L49
Here you state that the exception is just ignored and an empty list is returned
i.e. in both cases we get []
If you send queries with the parameter page=1, 2, 3, 4, etc., the API keeps returning results indefinitely; after a point the results simply repeat in random order.
Therefore the max_results parameter was added to get all results (in this case DDG returns at most 200 results).
With max_results set, queries are sent to the API in multiple threads (to speed things up) and checked against a cache, to remove duplicates and avoid unnecessary queries. But since the API can sometimes return an empty answer or an error, for compatibility the request returns [] on error.
This is a peculiarity of the implementation: the package is tuned to pull all results as fast as possible.
If you do not use the max_results parameter, the API returns only the results from the first page (page=1). If you do that from several threads, you will get errors, because the API will block you.
The package returns None if there was an error while obtaining the vqd, which clearly indicates that your IP is temporarily blocked.
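The page-by-page collection with duplicate removal described above could be sketched roughly as follows. This is a simplified, sequential illustration, not the package's multithreaded code; fetch_page is a hypothetical callable that returns a list of result dicts for a page number, and the "href" key used for deduplication is an assumption.

```python
def collect_results(fetch_page, max_results=200, max_pages=20):
    """Collect unique results page by page until max_results is reached."""
    seen = set()      # cache of hrefs already collected
    results = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            # Empty answer or error (None): stop instead of looping forever.
            break
        new_items = 0
        for item in batch:
            key = item["href"]  # assumed unique key of a result
            if key in seen:
                continue        # the API is repeating itself, skip it
            seen.add(key)
            results.append(item)
            new_items += 1
            if len(results) >= max_results:
                return results
        if new_items == 0:
            # The page contained only repeats, so nothing new is left.
            break
    return results
```

Note that this sketch has the same weakness discussed in this thread: an error on a page is indistinguishable from the end of the results unless fetch_page raises instead of returning an empty value.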
@deedy5 example:
import duckduckgo_search

a = True
i = 0
# Repeat the same query until a falsy result comes back.
while a:
    a = duckduckgo_search.ddg('"test test"')
    print(f"{i} {len(a)}")
    i += 1
Output:
0 29
1 29
2 29
3 29
4 29
5 29
6 29
7 28
8 29
9 29
10 29
11 29
12 29
13 29
14 29
15 29
16 29
17 28
18 28
19 28
20 28
21 28
22 29
23 28
24 0
Obviously request 24 was blocked, but there is no way of knowing that, because the result is just an empty list.
This is something new. Request 24 returns a response with status 200, but the response body is not valid JSON:
window.execDeep=function(){return{is506:1,bn:{ivc:1,ibc:0}};};
I need time to figure it out.
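One possible way to tell such a response apart from a genuinely empty result is to raise a dedicated exception when the body cannot be parsed, instead of returning []. The sketch below is only an illustration: BlockedResponse and parse_results are hypothetical names, and the "results" key of the payload is an assumption about the API's JSON shape.

```python
import json


class BlockedResponse(Exception):
    """Raised when a response body is not the expected JSON payload."""


def parse_results(body):
    """Return the result list from a JSON body, or raise BlockedResponse."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        # Status 200 with a non-JSON body, such as the JavaScript snippet above.
        raise BlockedResponse(body[:80]) from exc
    if not isinstance(payload, dict):
        raise BlockedResponse(body[:80])
    # Assumption: the results live under a "results" key.
    return payload.get("results", [])
```

With this shape, an empty list always means "no results", and a caller can catch BlockedResponse to wait and retry.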
I ran the code, but sometimes I just got None.