serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

[Google Search API] Searches return inconsistent number of `organic_results` with the `num` parameter #1061

Closed marm123 closed 11 months ago

marm123 commented 1 year ago

One of our customers reported that Google returns fewer than ten results even with num=10 (which is also the default value). I'm not sure there is anything we can do about it, but we could look into whether the num parameter can be made more consistent on mobile.

Search with num=10 and 9 organic_results:

image

Search with num=10 and 6 organic_results:

image

Playground - 9 results | Inspect - 9 results

Playground - 6 results | Inspect - 6 results

Intercom

martin-serpapi commented 11 months ago

Another customer reported this. In their case it is happening on Desktop.

Search with num=1 and 0 organic_results:

image

Playground - num=1 & 0 organic_results | Inspect - num=1 & 0 organic_results

I was able to reproduce it with q=Coffee & num=10 (safe=off & filter=0 are set as well):

image

Playground - num=10 & 8 organic_results | Inspect - num=10 & 8 organic_results

Intercom

martin-serpapi commented 11 months ago

Adding urgent label as this is a crucial regression in our main API.

aliayar commented 11 months ago

I am not sure this is an issue on our end. Google returns 10 results by default. Explicitly adding num=10 might be interpreted by Google as a sign of scraping activity, which could explain the issues.

Martin's latest report includes an even more bizarre case, where the user sets num=1. Google doesn't provide a UI for setting the num parameter on its results page, so only technical users are expected to use it, by manually editing the URL. Google may have trouble responding to requests that carry such unexpected parameters.

Since Google also recently changed its pagination to endless scrolling, I believe that adding start=0, or a num value other than 10 or a multiple of 10, creates issues for users.

Users who stick to Google's default search parameters have never reported such issues; the only reports we have received come from users who change the default parameters in unexpected ways.

I will change the label back to ready until one of our engineers investigates this and makes an informed decision regarding the future of this report.

martin-serpapi commented 11 months ago

I've tested sending a request without the num and start parameters, with safe=off and filter=0. It still returns only 8 organic_results:

image

Playground | Inspect

I've executed the same request in Google, connected to a New York VPN, and Google is returning only 8 organic results in the response as well:

image

image

Google Search

I've also tested sending multiple requests to our Google Search API using a Node.js script for a few different search queries, using the following parameters:

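// Assumes the "serpapi" npm package; topic and API_KEY are defined elsewhere in the script.
import { getJson } from "serpapi";

const response = await getJson({
  engine: "google",
  q: topic, // the search query being tested
  location: "New York, NY, United States",
  google_domain: "google.com",
  hl: "en",
  gl: "us",
  filter: 0, // same as filter=0 in the requests above
  safe: "off", // same as safe=off in the requests above
  async: true,
  api_key: API_KEY,
});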

A different number of organic_results is returned for different queries.

image

This behaviour seems to be coming from Google itself.

Shouldn't we restrict the num parameter on our end, rejecting values below 10 or values that are not a multiple of 10, if it isn't supposed to be used with other values? The same goes for the start parameter: 0 could not be passed, only 10 and multiples of 10.
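The restriction proposed above could look roughly like the sketch below. It is only an illustration of the proposal, not existing SerpApi behavior: the function name `validatePagingParams` and the error messages are hypothetical, and the rule (10 or a multiple of 10 for both `num` and `start`) is taken directly from the question.

```javascript
// Hypothetical validator for the proposed rule; not part of any SerpApi library.
// A value passes only if it is an integer, at least 10, and a multiple of 10.
function isMultipleOfTen(value) {
  return Number.isInteger(value) && value >= 10 && value % 10 === 0;
}

// Returns a list of error messages; an empty list means the parameters pass.
// Omitted parameters are allowed, since Google's defaults then apply.
function validatePagingParams({ num, start } = {}) {
  const errors = [];
  if (num !== undefined && !isMultipleOfTen(num)) {
    errors.push(`num must be 10 or a multiple of 10, got ${num}`);
  }
  if (start !== undefined && !isMultipleOfTen(start)) {
    errors.push(`start must be 10 or a multiple of 10, got ${start}`);
  }
  return errors;
}

console.log(validatePagingParams({ num: 1 }));
console.log(validatePagingParams({ num: 10, start: 0 }));
console.log(validatePagingParams({ num: 20, start: 10 }));
```

Returning a list of messages rather than throwing would let a caller report every problem at once, but that is a design choice of this sketch only.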

hartator commented 11 months ago

Yes, the number of organic results Google returns is variable. num is not a guarantee.
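In client code, this means treating num as an upper bound rather than an exact count. A minimal sketch, assuming the response is a parsed JSON object with an optional `organic_results` array; the helper name `summarizeOrganicResults` and the sample data are hypothetical:

```javascript
// Hypothetical helper: compares the requested count with what actually came back.
function summarizeOrganicResults(response, requested = 10) {
  // organic_results can be missing entirely, as in the num=1 report above.
  const results = Array.isArray(response.organic_results)
    ? response.organic_results
    : [];
  return {
    requested,
    received: results.length,
    // How many results short of the requested count the response is.
    shortfall: Math.max(0, requested - results.length),
    // Never hand back more than was asked for.
    results: results.slice(0, requested),
  };
}

// Illustrative data only, not real search output.
const exampleResponse = {
  organic_results: Array.from({ length: 8 }, (_, i) => ({ position: i + 1 })),
};
const summary = summarizeOrganicResults(exampleResponse, 10);
console.log(summary.received, summary.shortfall);
```

A caller can then decide what to do with a non-zero shortfall (accept it, or request another page) instead of assuming a fixed page size.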

I think it could be interesting to expand on @martin-serpapi's research and write a blog post just about num: its behavior, expected values, and history.

> Shouldn't we restrict the num parameter on our end, rejecting values below 10 or values that are not a multiple of 10, if it isn't supposed to be used with other values? The same goes for the start parameter: 0 could not be passed, only 10 and multiples of 10.

I think we already talked about this somewhere. I don't remember our reasoning for not restricting num. I think some customers had valid use cases for more creative values of num, e.g., num=1 to get smaller and faster results.

Closing this as non-fixable.