jodevsa / subscene_scraper

Library to download subtitles from subscene.com
13 stars, 6 forks

error: Error: script got blocked 409 resCode #9

Closed mognia closed 4 years ago

mognia commented 6 years ago

When I try to download subtitles for an array of movie names, it throws this error:

error: Error: script got blocked 409 resCode
    at _callee$ (E:\Projects\subyab v2\node_modules\subscene_scraper\dist\main.bundle.js:177:61)
    at tryCatch (E:\Projects\subyab v2\node_modules\babel-polyfill\node_modules\regenerator-runtime\runtime.js:65:40)
    at Generator.invoke [as _invoke] (E:\Projects\subyab v2\node_modules\babel-polyfill\node_modules\regenerator-runtime\runtime.js:303:22)
    at Generator.prototype.(anonymous function) [as next] (E:\Projects\subyab v2\node_modules\babel-polyfill\node_modules\regenerator-runtime\runtime.js:117:21)
    at step (E:\Projects\subyab v2\node_modules\subscene_scraper\dist\main.bundle.js:205:191)
    at E:\Projects\subyab v2\node_modules\subscene_scraper\dist\main.bundle.js:205:361
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)

jodevsa commented 6 years ago

Hello @mmgh021,

This is caused by rate limiting on Subscene's side: it restricts the script's ability to retrieve a large number of subtitles in a short period of time.
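For illustration only, one generic way to cope with a rate limit like this is to retry a blocked request after a growing delay. The sketch below is not part of subscene_scraper: `retryWhenBlocked`, `sleep`, and the `retries` / `baseDelayMs` options are hypothetical names, and it assumes a blocked request rejects with a message containing "blocked", as in the trace above.

```javascript
// Hypothetical helper, not part of subscene_scraper.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `task` (a function returning a promise). If it fails with a
// "blocked" error, waits and tries again, doubling the delay each time.
// `retries` is the number of extra attempts after the first one.
async function retryWhenBlocked(task, { retries = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Assumption: rate-limited requests reject with a message that
      // contains "blocked" ("script got blocked 409 resCode").
      const blocked = /blocked/i.test(String(err && err.message));
      if (!blocked || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 1x, 2x, 4x, ...
    }
  }
}

// Possible usage (the delay values are guesses, not measured limits):
// retryWhenBlocked(() => passiveDownloader("batman", "english", process.cwd()));
```

Retrying only helps if the block is temporary; it does not replace keeping the request rate low in the first place.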

A workaround/hack for this is to limit yourself to 2 subtitles at a time:

ES2017 (async/await) example:

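var process = require("process");
var { passiveDownloader } = require('subscene_scraper');
var EventEmitter = require('events');

// Downloads subtitles in batches of `concurrency`, emitting "download" after each batch.
function download_array(movies, language, path, concurrency) {
  let emitter = new EventEmitter();
  // Defer the work one tick so the caller can attach listeners first.
  process.nextTick(async () => {
    try {
      for (let i = 0; i < movies.length; i += concurrency) {
        // Start at most `concurrency` downloads, then wait for the whole batch.
        let downloaders = movies.slice(i, i + concurrency).map((m) => passiveDownloader(m, language, path));
        let results = await Promise.all(downloaders);
        emitter.emit("download", results);
      }
    } catch (err) {
      // Report a failed batch (e.g. the 409 block) instead of an unhandled rejection.
      emitter.emit("error", err);
    }
  });
  return emitter;
}

//// main code ////

var path = process.cwd();
var movies = ["batman", "shawshank redemption", "The Shawshank Redemption", "The Godfather", "The Dark Knight", "Pulp Fiction", "The good, The bad and the Ugly"];
var lang = "english";
var concurrency = 2;
download_array(movies, lang, path, concurrency)
  .on('download', function (results) {
    console.log("downloaded: " + results.length + " movies..", results);
  })
  .on('error', function (err) {
    console.error("download failed:", err);
  });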

This will be added to the package soon.

Note: Subscene only rate-limits search requests, so a throttling solution based on this finding could be added in 1.4.0.
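As a rough idea of what such a throttle could look like, here is a minimal sketch. It is not the planned 1.4.0 implementation: `createThrottle` and `minIntervalMs` are hypothetical names, and the helper knows nothing about Subscene. It runs promise-returning tasks one at a time and keeps at least `minIntervalMs` between consecutive starts.

```javascript
// Hypothetical helper, not part of subscene_scraper.
// Returns a function that queues tasks, runs them sequentially, and
// starts each one at least `minIntervalMs` after the previous start.
function createThrottle(minIntervalMs) {
  let chain = Promise.resolve(); // tail of the queue
  let lastStart = 0;             // timestamp of the most recent task start

  return function throttled(task) {
    const run = chain.then(async () => {
      const wait = lastStart + minIntervalMs - Date.now();
      if (wait > 0) {
        await new Promise((resolve) => setTimeout(resolve, wait));
      }
      lastStart = Date.now();
      return task();
    });
    // Keep the queue alive even if this task rejects.
    chain = run.catch(() => {});
    return run; // the caller still sees the task's own result or error
  };
}

// Possible usage (the 2000 ms interval is a guess, not a measured limit):
// const throttledDownload = createThrottle(2000);
// Promise.all(movies.map((m) => throttledDownload(() => passiveDownloader(m, lang, path))));
```

Wrapping the whole download call like this is coarser than throttling only the search step, since the download itself also waits its turn; a version inside the library could apply the same queue to search requests alone.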