Closed kagermanov27 closed 1 year ago
A workaround for pagination(), using parse_qsl and urlsplit from urllib.parse:
from serpapi import GoogleSearch
from urllib.parse import parse_qsl, urlsplit

params = {
    'api_key': '...',      # serpapi api key
    'engine': 'google',    # search engine
    'q': 'minecraft',      # search query
    'hl': 'en',            # language
    # 'start': 0           # or explicitly add page number
}

search = GoogleSearch(params)

# iterate over all pages
while True:
    results = search.get_dict()
    print(f"Currently on page: {params.get('start')}")

    if 'error' in results:
        print(results['error'])
        break

    # for result in results.get('organic_results', []):
    #     print(result.get('position'), result.get('title'), sep='\n')

    # check if the next page key is present in the JSON
    # if present -> split URL in parts and update to the next page
    if 'next' in results.get('serpapi_pagination', {}):
        next_url = results['serpapi_pagination']['next']
        search.params_dict.update(dict(parse_qsl(urlsplit(next_url).query)))
    else:
        break

Output:
Currently on page: None # first page; if 'start' is set explicitly to 0, None becomes 0
Currently on page: 10 # second page
Currently on page: 20
Currently on page: 30
Currently on page: 40
Currently on page: 50
Currently on page: 60
Currently on page: 70
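The step that does the real work in the loop above is turning the next URL into request parameters. A minimal offline sketch of just that step follows; the URL is made up for illustration (an assumption about the general shape of a serpapi_pagination value, not real API output), and next_page_params is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlsplit


def next_page_params(next_url):
    """Return the query parameters of a pagination URL as a dict."""
    return dict(parse_qsl(urlsplit(next_url).query))


# Hypothetical 'next' URL (assumption, for illustration only).
example_next = "https://serpapi.com/search.json?engine=google&hl=en&q=minecraft&start=10"

params = {"engine": "google", "q": "minecraft", "hl": "en"}
params.update(next_page_params(example_next))

# parse_qsl returns strings, so 'start' is the string '10', not the int 10.
print(params["start"])
```

Because the parameters are taken from the URL the API returned, the loop never has to compute the next offset itself.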
Example from the dashboard -> Your Searches page (inspect the last request, the 8th page: the next hash key is empty, so pagination exits):
Since the start value starts from 0, the correct second page should be 10 and not 11. This behaviour also causes results to be skipped between pages. Customers are getting confusing results:
Intercom Link
First recognized by @marm123.
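To make the off-by-one concrete, here is a small sketch of the offset arithmetic. It assumes 10 results per page and a zero-based start; both the page size and the helper names are assumptions for illustration, not part of the library.

```python
RESULTS_PER_PAGE = 10  # assumption: default page size


def start_for_page(page, per_page=RESULTS_PER_PAGE):
    """Zero-based offset of the first result on a 1-based page number."""
    return (page - 1) * per_page


def covered(start, per_page=RESULTS_PER_PAGE):
    """Zero-based result indices returned for a given start offset."""
    return set(range(start, start + per_page))


first = covered(start_for_page(1))           # indices 0..9
correct_second = covered(start_for_page(2))  # indices 10..19
shifted_second = covered(11)                 # indices 11..20

# Results seen with the correct offset but missed with start=11.
skipped = (first | correct_second) - (first | shifted_second)
print(start_for_page(2), sorted(skipped))
```

Under these assumptions, starting the second page at 11 drops the result at index 10, and the gap is never revisited by later pages.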
I think this part needs to be replaced by:
I don't know whether this would cause errors on other engines, but it may also fix the issue for them.