roflmuffin / node-anime-scraper

Scrapes information from Gogoanime to get Anime, Episode & Video information & urls.

MaxRetryError while fetching episodes #15

Closed ceesvanegmond closed 7 years ago

ceesvanegmond commented 7 years ago

I'm getting the following error.

MaxRetryError: Retrieving page after 5 attempts failed. Something has most likely been broken by external forces.

The script I'm using:

var Anime = require('anime-scraper').Anime

// Loads the anime from its URL
Anime.fromUrl('http://kissanime.to/Anime/K-On').then(function (anime) {
    console.log(anime)
    // Fetches every episode; this is the call that fails
    anime.fetchAllEpisodes().then(function (episodes) {
        console.log(episodes)
    })
})

It does fetch the main anime, but fails when I fetch the episodes. Is this a known problem?

roflmuffin commented 7 years ago

You may be hitting the 'Please Enter Captcha' message, which KA (KissAnime) shows when it detects that you are loading too many pages too quickly.

Can you try fetching one episode and let me know the result?

var Anime = require('anime-scraper').Anime

// Loads the anime from its URL, then fetches only the first episode
Anime.fromUrl('http://kissanime.to/Anime/K-On').then(function (anime) {
    anime.episodes[0].fetch().then(function (episode) {
        console.log(episode)
    })
})

ceesvanegmond commented 7 years ago

@roflmuffin That does actually work! How can I solve the Captcha problem?

roflmuffin commented 7 years ago

I believe they will block you if you make too many requests in a short period, but the limit is removed if you manually complete the Captcha on their site.

I am working on a rate-limiting system, but I'm still trying to work out which delays work best.
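
To illustrate the idea of spacing out requests, here is a minimal sketch of a helper that runs promise-returning tasks one at a time with a fixed pause between them. The names `sleep` and `runSequentially` are hypothetical and not part of anime-scraper; this is not how the library implements its rate limiting, only one way to get a similar effect in user code.

```javascript
// Hypothetical helpers, not part of anime-scraper.

// Resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms)
  })
}

// Runs each task (a function returning a promise) in order, waiting
// `delayMs` between consecutive tasks. Resolves with all results in order.
function runSequentially(tasks, delayMs) {
  var results = []
  return tasks.reduce(function (chain, task, index) {
    return chain
      .then(function () {
        // No pause is needed before the very first task.
        return index === 0 ? undefined : sleep(delayMs)
      })
      .then(function () {
        return task()
      })
      .then(function (result) {
        results.push(result)
      })
  }, Promise.resolve()).then(function () {
    return results
  })
}

// Usage sketch with the episode objects shown earlier in this thread:
// var tasks = anime.episodes.map(function (episode) {
//   return function () { return episode.fetch() }
// })
// runSequentially(tasks, 8000).then(function (episodes) {
//   console.log(episodes)
// })
```

Running the tasks strictly in sequence keeps at most one request in flight, so the delay is a true minimum gap between requests. The right delay has to be found by trial.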

ceesvanegmond commented 7 years ago

@roflmuffin Cool thx!

roflmuffin commented 7 years ago

Hi Cees,

In my latest commit d9d91a4 (NPM version 2.1.3) I have added the ability to use optional rate limiting. To implement this in your code above you would do:

var Anime = require('anime-scraper').Anime

// Wait 5000 ms (5 seconds) between requests
Anime.setDelay(5000)

// Loads the anime from its URL
Anime.fromUrl('http://kissanime.to/Anime/K-On').then(function (anime) {
    console.log(anime)
    anime.fetchAllEpisodes().then(function (episodes) {
        console.log(episodes)
    })
})

This sets a delay of 5 seconds between requests. Frustratingly, I was getting Captcha errors with any delay shorter than 8 seconds when bulk-retrieving episodes, so you may need to raise the value :-1:

I've also added a more detailed error so you should receive the reason KA is blocking you.
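
To actually see that reason, the promise chain needs a rejection handler. Below is a minimal sketch; `describeError` is a hypothetical helper, not part of anime-scraper, and it assumes only the standard `name` and `message` properties of JavaScript `Error` objects. The exact shape of the library's errors is not shown in this thread.

```javascript
// Hypothetical helper, not part of anime-scraper: formats a rejection
// reason as a single line, using only standard Error properties.
function describeError(err) {
  if (err instanceof Error) {
    return err.name + ': ' + err.message
  }
  // Promises can be rejected with non-Error values as well.
  return 'Unknown error: ' + String(err)
}

// Usage sketch with the calls from this thread:
// anime.fetchAllEpisodes().then(function (episodes) {
//   console.log(episodes)
// }).catch(function (err) {
//   console.error(describeError(err))
// })
```

Without a `.catch`, a failed fetch surfaces only as an unhandled rejection, which makes it harder to tell a Captcha block from some other failure.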

Let me know how it goes for you.