serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

[Google Shopping Product API] Reviews Pagination not Working #115

Closed aliayar closed 2 years ago

aliayar commented 2 years ago

SerpApi can't paginate to the second and subsequent pages of reviews. Regardless of the parameters passed, it stays stuck on the first page of reviews.

First page: The Playground | The Inspect | Google


Second page: The Playground | The Inspect | Google


Neither start nor page params are working at the moment.

ilyazub commented 2 years ago

Google started using an rpt:C<symbol><symbol><symbol> value in the filter (prds) parameter for pagination. Example URL.

While we're working on the fix, the rnum:100 filter parameter can be used to fetch at least the first 100 reviews. Our documentation is missing this filter.

Example Google URL


Playground URL
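
As a rough illustration of the workaround above, here is a minimal Python sketch that builds a request URL carrying the rnum:100 filter. Only the filter parameter and its rnum:100 value come from this thread; the endpoint, the engine name, the product_id and reviews parameters, and the helper name build_reviews_url are assumptions made for the example and should be checked against the current documentation.

```python
from urllib.parse import urlencode

# Assumed endpoint; not stated in this thread.
SEARCH_ENDPOINT = "https://serpapi.com/search.json"


def build_reviews_url(product_id: str, api_key: str, review_count: int = 100) -> str:
    """Build a reviews request URL that asks for `review_count` reviews.

    Hypothetical helper: only `filter` and the `rnum:<n>` value are taken
    from the discussion above.
    """
    params = {
        "engine": "google_product",        # assumed engine name
        "product_id": product_id,          # assumed parameter name
        "reviews": "1",                    # assumed flag for the reviews view
        "filter": f"rnum:{review_count}",  # workaround from this thread
        "api_key": api_key,
    }
    # urlencode percent-encodes the colon, so the filter becomes rnum%3A100.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

The product ID and API key are placeholders supplied by the caller; the sketch only assembles the URL and does not send a request.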


rpt filter encoding

The rpt value seems to be an encoded result offset. It depends on rnum, the filter parameter that sets the number of results.

rnum:10 (default)

rpt values for subsequent pages:

1: CAA
2: CAg
3: CBA
4: CBg
5: CCA
6: CCg
7: CDA
8: CDg
9: CEA
10: CEg
11: CFA
12: CFg
13: CGA
14: CGg
15: CHA
16: CHg
17: CIA
18: CIg

Code to calculate

Ruby
# Build the 18 observed rpt values: "C" + a letter "A".."I" + "A" or "g".
rpt_page_filter_parameters = ("A".."I").flat_map { |i| ["A", "g"].map { |j| "C#{i}#{j}" } }
=> ["CAA", "CAg", "CBA", "CBg", "CCA", "CCg", "CDA", "CDg", "CEA", "CEg", "CFA", "CFg", "CGA", "CGg", "CHA", "CHg", "CIA", "CIg"]
offset = 20
# The list above is numbered from 1, so offset 20 is entry 2 (array index 1).
rpt_filter = rpt_page_filter_parameters[offset / 10 - 1]
prds = "rpt:#{rpt_filter}"

prds == "rpt:CAg"
=> true

Python
>>> [f"C{chr(c)}{j}" for c in range(ord("A"), ord("I") + 1) for j in "Ag"]
['CAA', 'CAg', 'CBA', 'CBg', 'CCA', 'CCg', 'CDA', 'CDg', 'CEA', 'CEg', 'CFA', 'CFg', 'CGA', 'CGg', 'CHA', 'CHg', 'CIA', 'CIg']
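
For completeness, here is a small Python sketch of the same offset lookup as the Ruby snippet. It assumes the numbered list above is 1-based, so that offset 20 maps to entry 2 (CAg), which matches the Ruby snippet's expected result. It only covers the 18 values observed for rnum:10, and the function name rpt_filter_for_offset is hypothetical.

```python
def rpt_filter_for_offset(offset: int, page_size: int = 10) -> str:
    """Return the prds/filter value for a review offset when rnum is 10.

    Assumption: entry N in the observed list corresponds to offset N * 10.
    """
    # Same 18 values as listed above: "C" + "A".."I" + "A" or "g".
    values = [f"C{letter}{suffix}" for letter in "ABCDEFGHI" for suffix in "Ag"]
    entry = offset // page_size  # 1-based entry number in the list
    if offset % page_size != 0 or not 1 <= entry <= len(values):
        raise ValueError(f"no observed rpt value for offset {offset}")
    return f"rpt:{values[entry - 1]}"
```

Offsets outside the observed range raise an error instead of guessing, since the encoding beyond these values has not been worked out here.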

rnum:100

We're still working on decoding the rpt parameter when rnum is not 10.

Values for subsequent pages:

rpt:CHc
rpt:CNsB
rpt:CL8C
rpt:CKMD
rpt:CIcE
rpt:COsE
rpt:CM8F
rpt:CLMG
rpt:CJcH
rpt:CPsH
rpt:CN8I
rpt:CMMJ
rpt:CKcK
rpt:CIsL
rpt:CO8L
rpt:CNMM
rpt:CLcN
rpt:CJsO
rpt:CP8O
rpt:COMP
rpt:CMcQ
rpt:CKsR
rpt:CI8S
rpt:CPMS
rpt:CNcT
rpt:CLsU
rpt:CJ8V
rpt:CIMW
rpt:COcW
rpt:CMsX
rpt:CK8Y
rpt:CJMZ
rpt:CPcZ
rpt:CNsa
rpt:CL8b
rpt:CKMc
rpt:CIcd

ilyazub commented 2 years ago

We've just shipped a fix. Sorry for the huge delay with this issue.

The next page URL is available via serpapi_pagination.next. There's also a serpapi_pagination.next_page_filter that contains the value for the filter parameter. Here's an example Playground page.
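
Following serpapi_pagination.next could look roughly like the Python sketch below. The serpapi_pagination.next key comes from the comment above; the function name iterate_pages and the fetch callable are hypothetical. fetch stands in for whatever performs the HTTP GET (including any API key it needs) and returns the parsed JSON, so the loop itself does not depend on a particular HTTP client.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Set


def iterate_pages(
    first_url: str,
    fetch: Callable[[str], Dict[str, Any]],
    max_pages: int = 50,
) -> Iterator[Dict[str, Any]]:
    """Yield each page of results by following serpapi_pagination.next.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    parsed JSON response as a dict.
    """
    url: Optional[str] = first_url
    seen: Set[str] = set()
    # Stop when there is no next link, a link repeats, or the cap is reached.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        yield page
        url = page.get("serpapi_pagination", {}).get("next")
```

The seen set and the max_pages cap guard against the stuck-on-first-page behavior described at the top of this issue, where the same page could otherwise be requested indefinitely.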
