facundoolano / google-play-scraper

Node.js scraper to get data from Google Play
MIT License
2.3k stars 627 forks

Max. pages for scraping reviews is 111? / How to use throttle for not getting banned. #125

Closed eoehri closed 7 years ago

eoehri commented 7 years ago

Hi, I found that the maximum number of review pages I can scrape is 111. First question: can someone verify this? Is it somehow possible to scrape more pages? Trying to scrape 112 pages gives me errors for the last 40 reviews (each page contains 40 reviews). So each run scrapes at most 4440 reviews (111 * 40 = 4440). When I do this 3-4 times, I get banned. As far as I understand, throttling has also been implemented for the reviews, but I was not able to figure out how to use it. Could someone show how to apply it, using the example from the README file? Concretely, in the following example (adding "throttle = 10" after "sort: ..." won't work):

var gplay = require('google-play-scraper');

gplay.reviews({
  appId: 'com.mojang.minecraftpe',
  page: 0,
  sort: gplay.sort.RATING
}).then(function (apps) {
  console.log('Retrieved ' + apps.length + ' reviews!');
}).catch(function (e) {
  console.log('There was an error fetching the reviews!');
});

facundoolano commented 7 years ago

There's no way that I know of to get more reviews for an app. This has come up several times; you can review the closed issues for details.

Regarding throttle, what do you mean it doesn't work? You can run your app with NODE_DEBUG=google-play-scraper to see every request being made and confirm they're batched to 10 per second when you supply throttle: 10.
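
For reference, here is a minimal sketch of what the reply seems to describe: throttle is passed as a property of the options object (throttle: 10), not as an assignment (throttle = 10). The helper names buildReviewOptions and fetchReviewPages are hypothetical and not part of the library. The module is passed in as an argument so the sketch can be tried with a stub. The thread does not say whether the throttle is shared across separate calls, so this version simply requests pages one at a time.

```javascript
// Hypothetical helper: builds the options object for gplay.reviews().
// Note the object-property syntax `throttle: 10` (not `throttle = 10`).
function buildReviewOptions(gplay, appId, page, throttle) {
  return {
    appId: appId,
    page: page,
    sort: gplay.sort.RATING,
    throttle: throttle // requests-per-second cap, as described in the reply above
  };
}

// Hypothetical helper: fetches pages 0..pageCount-1 sequentially and
// resolves with all reviews concatenated in page order.
function fetchReviewPages(gplay, appId, pageCount, throttle) {
  var all = [];
  var chain = Promise.resolve();
  for (var page = 0; page < pageCount; page++) {
    (function (p) {
      chain = chain.then(function () {
        return gplay.reviews(buildReviewOptions(gplay, appId, p, throttle));
      }).then(function (reviews) {
        all = all.concat(reviews);
      });
    })(page);
  }
  return chain.then(function () {
    return all;
  });
}
```

With the real module this would be called as fetchReviewPages(require('google-play-scraper'), 'com.mojang.minecraftpe', 3, 10), followed by .then() and .catch() handlers like those in the original example. Running the script with NODE_DEBUG=google-play-scraper set, as suggested above, should show whether the requests are actually being spaced out.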