JimmyLaurent / torrent-search-api

Yet another node torrent scraper (supports iptorrents, torrentleech, torrent9, torrentz2, 1337x, thepiratebay, Yggtorrent, TorrentProject, Eztv, Yts, LimeTorrents)
MIT License
394 stars, 100 forks

Yggtorrent - CaptchaError: captcha #102

Closed gautiermorel closed 4 years ago

gautiermorel commented 4 years ago

Hello, I am facing this issue while requesting Yggtorrent...

{ CaptchaError: captcha
0|api | at validateResponse (/var/www/api/node_modules/cloudscraper/index.js:273:11)
0|api | at onCloudflareResponse (/var/www/api/node_modules/cloudscraper/index.js:222:5)
0|api | at onRequestResponse (/var/www/api/node_modules/cloudscraper/index.js:205:5)
0|api | at Request.<anonymous> (/var/www/api/node_modules/cloudscraper/index.js:149:7)
0|api | at Object.onceWrapper (events.js:277:13)
0|api | at Request.emit (events.js:189:13)
0|api | at Request.<anonymous> (/var/www/api/node_modules/request/request.js:1161:10)
0|api | at Request.emit (events.js:189:13)
0|api | at Gunzip.<anonymous> (/var/www/api/node_modules/request/request.js:1083:12)
0|api | at Object.onceWrapper (events.js:277:13)
0|api | at Gunzip.emit (events.js:194:15)
0|api | at endReadableNT (_stream_readable.js:1103:12)
0|api | at process._tickCallback (internal/process/next_tick.js:63:19)
name: 'CaptchaError', message: 'captcha' }

Any clue how to solve it?

Thanks !

ylon commented 4 years ago

Hi, are you up to date with torrent-search-api?

JimmyLaurent commented 4 years ago

Assuming you have the latest version of the package, you get this error because Cloudflare asks for a captcha when the yggtorrent page is fetched, and the library we use to deal with Cloudflare-protected websites (cloudscraper) doesn't support captcha solving.

I think your IP (a shared IP maybe?) has been flagged by Cloudflare for whatever reason (suspicious activity?), and now they protect the website with this captcha.

Check the Cloudflare support page for more information: here
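
For callers who want to tell this failure apart from other errors, here is a minimal sketch. It relies on the `name` field visible in the stack trace above (`'CaptchaError'`); the `'CloudflareError'` name is assumed from error text quoted elsewhere in this thread. The `classifyScrapeError` helper and its return labels are hypothetical and are not part of torrent-search-api or cloudscraper.

```javascript
// Hypothetical helper: labels an error thrown while querying a provider.
// Only error names that appear in this thread are recognised.
function classifyScrapeError(err) {
  if (!err || typeof err !== 'object') return 'unknown';
  if (err.name === 'CaptchaError') return 'captcha';        // Cloudflare asked for a captcha
  if (err.name === 'CloudflareError') return 'cloudflare';  // e.g. "1020, Access Denied"
  return 'unknown';
}

// Usage sketch: this object mirrors the one logged in the report above.
const sampleError = { name: 'CaptchaError', message: 'captcha' };
console.log(classifyScrapeError(sampleError)); // prints "captcha"
```

A caller could use the label to decide whether retrying makes sense: a captcha will not go away on an immediate retry from the same IP.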

GregDiego commented 4 years ago

Hi all, I've been facing the same issue with yggtorrent since yesterday, and I have tried different IPs. It seems that something has changed in their captcha challenge.

JimmyLaurent commented 4 years ago

I confirm, I reproduced it. There's an issue opened here.

I'll wait for a patch and upgrade the package.

gautiermorel commented 4 years ago

Thanks :)

cyril-colin commented 4 years ago

Hi :)

I get a similar error when I run a search, but the response says: "1020, Access Denied (Custom Firewall Rules)"

Can you confirm that it is the same problem?

Thanks!

EDIT: My bad, there is already an issue for this on the cloudscraper GitHub: https://github.com/codemanki/cloudscraper/issues/343

JimmyLaurent commented 4 years ago

If you want a temporary solution, try this patch.

It works for me, but there's a new Cloudflare challenge, so it may fail in some cases.

Wunax commented 4 years ago

The fix doesn't seem to work anymore...

cyril-colin commented 4 years ago

Well, cloudscraper is no longer supported... But there may be an alternative: https://github.com/codemanki/cloudscraper/issues/343#issuecomment-623480754

What do you think ?

JimmyLaurent commented 4 years ago

It's not that simple: the alternative package (hooman) only supports one kind of Cloudflare challenge, whereas cloudscraper handles multiple challenges and can chain them. So hooman may fix Yggtorrent and break other providers.

I'd rather use a fork of cloudscraper that includes the fix in this package. Furthermore, there is a new type of challenge that neither of the two packages supports, so it will not work in all cases.

I'll check if I can fix cloudscraper, and if someone wants to make a PR with hooman, I could publish it under an experimental tag (e.g. "torrent-search-api@2.1.1-experimental").

UPDATE: cloudscraper with the fix mentioned before is equivalent to hooman. I got the same result: a 403 due to a captcha to solve.

andress134 commented 4 years ago

Can you fix cloudscraper? I updated to the new version fixed by @bestplay9384, but I still get an error: either a captcha error on a UAM target, or CloudflareError: 1020, Access Denied (Custom Firewall Rules)

JimmyLaurent commented 4 years ago

I tried to fix cloudscraper but it's tricky; I'm kind of stuck on the captcha part.

Even if I think it's possible to bypass both Cloudflare's bot detection and the hCaptcha service, I don't think it will last long term, and I don't have the energy to patch it every week. It's a cat-and-mouse game and the Cloudflare team reacts quickly.

For a more reliable but also more resource-consuming approach, I'd use puppeteer (headless Chrome) to retrieve the "cf_clearance" cookie from Cloudflare. It means that you'd have to fire up an instance of Chrome every 24h (the cookie expiration time) for a short period. Once you have this cookie, it's pretty easy to inject it into "torrent-search-api".

I tried the following code and it seems to work fine:

const TorrentSearchApi = require('torrent-search-api');

const yggTorrentProvider = TorrentSearchApi.getProvider('Yggtorrent');
// Fix 1010 error: send a regular browser User-Agent
yggTorrentProvider.headers = {
  'User-Agent':
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
};
// Avoid cf challenges: reuse a cf_clearance cookie obtained from a real browser
yggTorrentProvider.setCookies([
  'cf_clearance=XXXX09b8166c53752ddaf28ecd1a209cea5b5237-1589898270-0-150;'
]);

I'll swap out the cloudscraper package if there's a new player in this game (hooman could be a good candidate, but it doesn't fix our issue for now).
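
The 24h renewal described above can be sketched as a small piece of bookkeeping around the cookie. This is only an illustration under stated assumptions: all names here (`CLEARANCE_TTL_MS`, `clearanceExpired`, `toCookieList`, `refreshIfNeeded`) are hypothetical and not part of torrent-search-api, and `fetchClearance` stands for whatever caller-supplied function obtains a fresh value, for instance one that drives headless Chrome.

```javascript
// Hypothetical helpers; none of these names exist in torrent-search-api.
const CLEARANCE_TTL_MS = 24 * 60 * 60 * 1000; // 24h expiration mentioned above

// True when a cookie obtained at `obtainedAtMs` has reached its assumed lifetime.
function clearanceExpired(obtainedAtMs, nowMs = Date.now()) {
  return nowMs - obtainedAtMs >= CLEARANCE_TTL_MS;
}

// Formats a raw value into the array shape passed to setCookies above.
function toCookieList(clearanceValue) {
  return [`cf_clearance=${clearanceValue};`];
}

// Returns the current state if it is still valid, otherwise fetches a new value.
// `fetchClearance` is supplied by the caller and must resolve to the raw cookie value.
async function refreshIfNeeded(state, fetchClearance, nowMs = Date.now()) {
  if (state && state.value && !clearanceExpired(state.obtainedAtMs, nowMs)) {
    return state;
  }
  const value = await fetchClearance();
  return { value, obtainedAtMs: nowMs };
}
```

After a refresh, `toCookieList(state.value)` gives the array to hand to `setCookies`, so the browser only needs to run when the stored cookie is missing or older than the assumed lifetime.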

andress134 commented 4 years ago

Hmm, cloudscraper works fine on captchas for me, but on UAM targets it doesn't work, or only works sometimes, and I don't know why. If I run cloudscraper 10 times, I get a captcha error 6 times (on a UAM website), another error 3 times, and only once does it get the cookie and work.

I lost my old Discord account, and with it the Discord contact of ParserError codemanki (one of the coders of cloudscraper). He fixed cloudscraper for me every time, and now I don't know how to contact him.

About hooman: I tested it, but it's far from being as good as cloudscraper.

setTimeout callback extraction failed at onChallenge (/root/cloudscraper/index.js:328:21)

unixfox commented 4 years ago

I got fed up with the Cloudflare anti-bot protection, so I created a proxy that internally uses Chromium to fetch the page: https://github.com/unixfox/pupflare.

It works pretty well, and there is no need to set any "cloudflare cookie" because, just like a normal Chromium, it stores the needed cookies while the server is up.

If you would like to use it with the torrent-search-api library, modify the base URL and disable the internal cloudscraper like this:

const yggTorrentProvider = TorrentSearchApi.getProvider('Yggtorrent');
yggTorrentProvider.baseUrl = "http://localhost:3000/?url=https://www2.yggtorrent.se";
yggTorrentProvider.enableCloudFareBypass = false;

EDIT & Update: I added the ability to do POST requests and to handle files like torrents (thanks Google Chrome, it was a nightmare to implement that), so it now works with all the functionalities of torrent-search-api.
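
As a small illustration of the base URL used above, the sketch below builds it from a proxy address and a target site, so the proxy location can be changed without editing the string by hand. `proxiedBaseUrl` and the `PUPFLARE_URL` environment variable are hypothetical names chosen for this example; neither is defined by pupflare or torrent-search-api. The target is appended unencoded, matching the snippet above.

```javascript
// Hypothetical helper: builds the base URL that routes a provider through a local proxy.
// The target is appended as-is (no URL encoding), like in the snippet above.
function proxiedBaseUrl(proxyOrigin, targetOrigin) {
  const proxy = proxyOrigin.replace(/\/+$/, '');   // drop trailing slashes
  const target = targetOrigin.replace(/\/+$/, '');
  return `${proxy}/?url=${target}`;
}

// PUPFLARE_URL is an assumed variable name; fall back to the address used above.
const proxyOrigin = process.env.PUPFLARE_URL || 'http://localhost:3000';
console.log(proxiedBaseUrl(proxyOrigin, 'https://www2.yggtorrent.se'));
```

The returned string is what would be assigned to the provider's `baseUrl`.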

gautiermorel commented 4 years ago

You're awesome! It works perfectly for me! My only question is: what happens when the cf_clearance expires after one day? Do we need to run a CRON job every day that makes at least one query to renew the Cloudflare clearance?

unixfox commented 4 years ago

It acts like a normal browser, so you will indeed have to set up a CRON job if you want a valid cf_clearance every time you use torrent-search-api. If you do not set up a CRON job, there will just be a small delay on the next request after the cf_clearance has expired. But that's a feature that I can add to my program: renewing the cf_clearance automatically as soon as it expires.
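
An in-process alternative to a CRON job can be sketched as a periodic "ping" that sends one request through the proxy so the clearance gets renewed before it is needed. This is a rough sketch, not a feature of pupflare or torrent-search-api: `startKeepAlive` is a hypothetical helper, `ping` is whatever caller-supplied function performs the request, and the 12-hour interval in the usage comment is an arbitrary value chosen to be shorter than the one-day lifetime discussed above.

```javascript
// Hypothetical helper: calls `ping` every `intervalMs` milliseconds and
// returns a function that stops the schedule. `timer` can be replaced in tests.
function startKeepAlive(ping, intervalMs, timer = { set: setInterval, clear: clearInterval }) {
  const report = (err) => console.error('keep-alive failed:', err && err.message);
  const handle = timer.set(() => {
    try {
      const result = ping();
      // If ping returns a promise, report a rejection instead of crashing.
      if (result && typeof result.catch === 'function') result.catch(report);
    } catch (err) {
      // A failed ping is reported but does not stop the schedule.
      report(err);
    }
  }, intervalMs);
  return () => timer.clear(handle);
}

// Usage sketch (not executed here):
//   const stop = startKeepAlive(() => TorrentSearchApi.search('test'), 12 * 60 * 60 * 1000);
//   ...later: stop();
```

Errors are logged rather than rethrown, so one failed renewal leaves the next scheduled attempt in place.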

JimmyLaurent commented 4 years ago

I added a new experimental Cloudflare bypass. It uses puppeteer, like pupflare, but only for grabbing the "cf_clearance" cookie.

To use it, take the latest version of "torrent-search-api" and also add "cloudflare-scraper" to your project dependencies. (It will remain an optional dependency, to keep the install small if you don't want to use a Chromium instance.)

unixfox commented 4 years ago

If you are having issues with a CAPTCHA when using pupflare or the latest version of torrent-search-api: it seems that yggtorrent has been having problems with infinite redirection lately. As you can see, hundreds of people are talking about it on Twitter: https://twitter.com/search?q=yggtorrent&src=typed_query. Even in a clean browser, I get a CAPTCHA that I need to complete in order to access yggtorrent.

JimmyLaurent commented 4 years ago

The Cloudflare protection has been deactivated. It may not last long, but we're back in business: the Yggtorrent provider is working.