serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

serpapi google scholar search results does not go beyond 1000 results #1569

Closed rohitkser closed 2 months ago

rohitkser commented 2 months ago

I am trying to fetch search results from Google Scholar. Here is the code:

from serpapi import GoogleSearch

params = {
  "api_key": "YOUR_API_KEY",
  "engine": "google_scholar",
  "q": "court",
  "hl": "en",
  "as_sdt": "4,216",
  "as_ylo": "2023",
  "as_yhi": "2023",
  "start": "990"
}

search = GoogleSearch(params)
results = search.get_dict()

The total number of search results is 1460. When I fetch page 100 (start=990, i.e. results 991 to 1000), the serpapi_pagination block does not give me a link to the next page (results beyond 1000). Here is what the serpapi_pagination block looks like:

{
  "current": 100,
  "previous_link": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=980",
  "previous": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=980",
  "other_pages": {
    "91": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=900",
    "92": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=910",
    "93": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=920",
    "94": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=930",
    "95": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=940",
    "96": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=950",
    "97": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=960",
    "98": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=970",
    "99": "https://serpapi.com/search.json?as_sdt=4%2C216&as_yhi=2023&as_ylo=2023&engine=google_scholar&hl=en&q=court&start=980"
  }
}

I was expecting the serpapi_pagination block to also include the pages beyond the current one (start=1000, 1010, 1020, ...), up to the last result.

schaferyan commented 2 months ago

Hi @rohitkser, thanks for the issue!

The reason you can't paginate beyond page 100 is that Google Scholar doesn't actually serve all 1,459 of the results it claims to have found. If you try this search in Google Scholar directly, you will also see that only 100 pages of results are provided:

https://scholar.google.com/scholar?start=990&q=court&hl=en&as_sdt=4,216&as_ylo=2023&as_yhi=2023
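The arithmetic behind that limit can be sketched roughly as follows. This is only an illustration: the helper names `page_number` and `reachable_starts` are hypothetical, and the constants assume 10 results per page and a ceiling of 100 pages, as observed in this thread.

```python
PAGE_SIZE = 10     # results per page, matching the start=900, 910, ... links above
MAX_SERVED = 1000  # assumed ceiling: 100 pages x 10 results, as observed in this thread


def page_number(start, page_size=PAGE_SIZE):
    """Return the 1-based page number for a given `start` offset."""
    return start // page_size + 1


def reachable_starts(total_reported, page_size=PAGE_SIZE, max_served=MAX_SERVED):
    """Return the `start` offsets that can actually be fetched.

    `total_reported` is the result count shown by the search engine,
    which may be larger than what is really served.
    """
    reachable = min(total_reported, max_served)
    return list(range(0, reachable, page_size))


starts = reachable_starts(1459)
print(page_number(990))         # 100
print(len(starts), starts[-1])  # 100 990
```

So `start=990` is already the last page that gets served, even though the reported total is higher.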

Unfortunately, there is no way for us to retrieve the results if they are not served by Google.
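In practice, a client can simply keep requesting pages until the response stops offering a next page. Below is a minimal sketch of that loop, not an official client: `collect_results` and its `fetch` argument are hypothetical, and it assumes each response carries its results under `organic_results` and that `serpapi_pagination` contains a `next` entry whenever another page exists (the block quoted above has `previous` but no `next`).

```python
def collect_results(fetch, params, page_size=10, max_pages=200):
    """Collect results page by page until no further page is offered.

    `fetch` is any callable that takes a params dict and returns the
    parsed JSON response as a dict (hypothetical helper, see note below).
    """
    params = dict(params)  # work on a copy so the caller's dict is untouched
    start = int(params.get("start", 0))
    collected = []

    for _ in range(max_pages):  # hard stop so the loop can never run forever
        params["start"] = str(start)
        page = fetch(params)

        items = page.get("organic_results", [])  # assumed results key
        collected.extend(items)

        pagination = page.get("serpapi_pagination", {})
        # Stop on an empty page or when no "next" entry is present (assumed key).
        if not items or "next" not in pagination:
            break
        start += page_size

    return collected
```

With the code from the question, `fetch` could be something like `lambda p: GoogleSearch(p).get_dict()`. For the query above, such a loop would be expected to stop after the page at `start=990`.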

I hope that makes sense. Feel free to reach out to us at contact@serpapi.com any time if you have any other questions.