FlareSolverr / FlareSolverr

Proxy server to bypass Cloudflare protection
MIT License

Exception: The cookies provided by FlareSolverr are not valid #25

ilike2burnthing closed this issue 3 years ago

ilike2burnthing commented 3 years ago

Jackett - Docker v0.17.50-ls9 amd64
FlareSolverr - Docker v1.2.0 amd64

When not using a proxy/VPN in Jackett's settings, this affects:

Indexers which are working when not using a proxy/VPN:

Indexers which get Exception (INDEXER): FlareSolverr was able to process the request, but a captcha was detected. Message: Captcha detected but no automatic solver is configured. (so would probably work if I set one up) when not using proxy/VPN:

However, the only indexers which work when using a proxy/VPN in Jackett's settings are:

The rest show that the cookies aren't valid, or that a captcha was detected. Is there any way to extend Jackett's proxy settings to FlareSolverr, or to add similar environment variables to FlareSolverr?

ngosang commented 3 years ago

The cookies provided by FlareSolverr are not valid

I'm getting this error in Epizod but cpasbien is working. Try to open the sites in the browser. In my case Epizod is not working in Chrome.

FlareSolverr was able to process the request, but a captcha was detected. Message: Captcha detected but no automatic solver is configured.

I see this on some sites too. It happens when Cloudflare shows the image-picking challenge (hCaptcha). FlareSolverr has 2 captcha providers, but I only tested 1 of them and it's not working. The good thing is that the image challenge only appears in some requests. If you get the cookie in one of them, the cookie is saved for some time.

Is there any way to extend Jackett's proxy settings to FlareSolverr, or to add similar environment variables to FlareSolverr?

I don't use proxies, so I never tested it. It should be possible to configure FlareSolverr's Chrome with the same proxy.

ilike2burnthing commented 3 years ago

Looks like the issue with cpasbien is a result of their Cloudflare implementation. I get the same page in browser whether I use a VPN or not - https://github.com/Jackett/Jackett/issues/10472#issuecomment-743468413 (I'm assuming it's region locked as I can access if I change my usual server to a French one). Nothing to do with FlareSolverr then.

https://wwv.epizod.tv/ and https://wwv.epizod.tv/?s= both work in browser for me.

ngosang commented 3 years ago

For epizod, could you run FlareSolverr with these environment vars: LOG_LEVEL=debug and LOG_HTML=true and share the traces? Be careful about sensitive information.

Proxy support => #26

OrpheeGT commented 3 years ago

Hello,

I have the same error on yggtorrent:

Jackett.Common.IndexerException: Exception (yggtorrent): FlareSolverr was able to process the request, but a captcha was detected. Message: Captcha detected but no automatic solver is configured.
 ---> FlareSolverrSharp.Exceptions.FlareSolverrException: FlareSolverr was able to process the request, but a captcha was detected. Message: Captcha detected but no automatic solver is configured.
   at FlareSolverrSharp.Solvers.FlareSolverr.<>cDisplayClass5_0.<b0>d.MoveNext()
--- End of stack trace from previous location ---
   at FlareSolverrSharp.Utilities.SemaphoreLocker.LockAsync[T](Func1 worker)
   at FlareSolverrSharp.Solvers.FlareSolverr.Solve(HttpRequestMessage request)
   at FlareSolverrSharp.ClearanceHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.SendAsyncCore(HttpRequestMessage request, HttpCompletionOption completionOption, Boolean async, Boolean emitTelemetryStartStop, CancellationToken cancellationToken)
   at Jackett.Common.Utils.Clients.HttpWebClient2.Run(WebRequest webRequest) in /home/vsts/work/1/s/src/Jackett.Common/Utils/Clients/HttpWebClient2.cs:line 166
   at Jackett.Common.Utils.Clients.WebClient.GetResultAsync(WebRequest request) in /home/vsts/work/1/s/src/Jackett.Common/Utils/Clients/WebClient.cs:line 184
   at Jackett.Common.Indexers.BaseWebIndexer.RequestWithCookiesAsync(String url, String cookieOverride, RequestType method, String referer, IEnumerable1 data, Dictionary2 headers, String rawbody, Nullable1 emulateBrowser) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/BaseIndexer.cs:line 481
   at Jackett.Common.Indexers.CardigannIndexer.PerformQuery(TorznabQuery query) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/CardigannIndexer.cs:line 1278
   at Jackett.Common.Indexers.BaseIndexer.ResultsForQuery(TorznabQuery query, Boolean isMetaIndexer) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/BaseIndexer.cs:line 369
--- End of inner exception stack trace ---
   at Jackett.Common.Indexers.BaseIndexer.ResultsForQuery(TorznabQuery query, Boolean isMetaIndexer) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/BaseIndexer.cs:line 377
   at Jackett.Common.Indexers.BaseWebIndexer.ResultsForQuery(TorznabQuery query, Boolean isMetaIndexer) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/BaseIndexer.cs:line 645
   at Jackett.Common.Services.IndexerManagerService.TestIndexer(String name) in /home/vsts/work/1/s/src/Jackett.Common/Services/IndexerManagerService.cs:line 300
   at Jackett.Server.Controllers.IndexerApiController.Test() in /home/vsts/work/1/s/src/Jackett.Server/Controllers/IndexerApiController.cs:line 132
   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.gAwaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.gAwaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.gAwaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.g__Awaited|19_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.gAwaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Jackett.Server.Middleware.CustomExceptionHandler.Invoke(HttpContext httpContext) in /home/vsts/work/1/s/src/Jackett.Server/Middleware/CustomExceptionHandler.cs:line 61

ilike2burnthing commented 3 years ago

LOG_LEVEL=debug and LOG_HTML=true

flaresolverr@1.2.0 start
> node ./dist/index.js
2020-12-14T00:57:54.314Z INFO REQ-0 FlareSolverr v1.2.0 listening on http://0.0.0.0:8191
2020-12-14T00:58:35.642Z INFO REQ-1 Incoming request: POST /v1
2020-12-14T00:58:35.647Z INFO REQ-1 Params: {"maxTimeout":60000,"cmd":"request.get","url":"https://wwv.epizod.tv/?s=","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"}
2020-12-14T00:58:35.652Z DEBUG REQ-1 Launching headless browser...
2020-12-14T00:58:42.626Z DEBUG REQ-1 Using custom UA: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
2020-12-14T00:58:42.637Z DEBUG REQ-1 Adding custom headers: {}
2020-12-14T00:58:42.638Z DEBUG REQ-1 { headers: [Function (anonymous)] }
2020-12-14T00:58:42.653Z DEBUG REQ-1 Navigating to... https://wwv.epizod.tv/?s=
2020-12-14T00:58:42.943Z DEBUG REQ-1 {
  headers: {
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'
  }
}
2020-12-14T00:58:43.802Z INFO REQ-1 Cloudflare detected
2020-12-14T00:58:43.819Z DEBUG REQ-1 No '#trk_jschal_js' challenge element detected.
2020-12-14T00:58:43.822Z DEBUG REQ-1 No '.ray_id' challenge element detected.
2020-12-14T00:58:43.826Z DEBUG REQ-1 No '.attack-box' challenge element detected.
2020-12-14T00:58:43.852Z INFO REQ-1 Successful response in 8.21 s

Nothing redacted, that's all there is.
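For anyone who wants to reproduce the request outside Jackett, here is a minimal sketch of the same call. The body mirrors the `Params` line in the log; the helper names `buildRequestGet` and `solve`, the `localhost` endpoint, and the use of Node 18+ global `fetch` are assumptions for illustration, not part of FlareSolverr or Jackett.

```javascript
// Hypothetical helper: builds the JSON body for a "request.get" command,
// using the same fields that appear in the Params log line.
function buildRequestGet(url, userAgent, maxTimeout = 60000) {
  const body = { maxTimeout, cmd: "request.get", url };
  // Only send a userAgent when the caller wants to override the default.
  if (userAgent) body.userAgent = userAgent;
  return body;
}

// Assumption: FlareSolverr is reachable on localhost:8191 (the port shown in
// the log) and the runtime is Node 18+ so that a global fetch exists.
async function solve(url, userAgent, endpoint = "http://localhost:8191/v1") {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequestGet(url, userAgent)),
  });
  return res.json();
}
```

Calling `solve("https://wwv.epizod.tv/?s=")` with and without a custom user agent should make it easy to compare the two replies.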

abeloin commented 3 years ago

Cloudflare hcaptcha is a little bit different than a normal hcaptcha.

We could install and enable by default hcaptcha-solver.

It works by clicking on random images, hoping that the guess is accepted.

More info regarding Cloudflare's hcaptcha implementation:

abeloin commented 3 years ago

@ilike2burnthing cpasbien is always presenting a Cloudflare hCaptcha for me from servers in France, Spain, Canada and the USA.

For epizod.tv, from your log, it's detecting Cloudflare but nothing after that, which is weird.

ilike2burnthing commented 3 years ago

Even weirder, I had Windows Sandbox open so decided to check Epizod there as well - worked, no VPN, no FlareSolverr. Turns out the site is a stitched together mess and the indexer needs a lot of work to make any sense out of it (my stab at it - https://github.com/Jackett/Jackett/pull/10505).

I wouldn't be surprised if this isn't some clever DDoS protection, but rather it's just broken.

abeloin commented 3 years ago

Ah interesting, epizod.tv is blocking any request if the user agent contains: X11; Linux x86_64.

We will have to add a check in the code so that it returns an error with a message instead of OK:

// Exception for epizod.tv
if (response.url().match(/epizod\.tv/gi) && (await page.content()).match(/error(\s)?code[^0-9]+1020/gi)) {
  await page.close()
  return ctx.errorResponse('Cloudflare has blocked this request (Code 1020 Detected via regex).')
}

{
    "status": "ok",
    "message": "",
    "startTimestamp": 0,
    "endTimestamp": 0,
    "version": "1.2.0",
    "solution": {
        "url": "https://wwv.epizod.tv/?s=test",
        "status": 403,
        "headers": {
            "status": "403",
            "date": "Mon, 14 Dec 2020 04:49:26 GMT",
            "content-type": "text/plain; charset=UTF-8",
            "content-length": "16",
            "set-cookie": "__cfduid=[random number]; expires=Wed, 13-Jan-21 04:49:26 GMT; path=/; domain=.epizod.tv; HttpOnly; SameSite=Lax; Secure",
            "x-frame-options": "SAMEORIGIN",
            "cache-control": "private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0",
            "expires": "Thu, 01 Jan 1970 00:00:01 GMT",
            "cf-request-id": "[random number]",
            "expect-ct": "max-age=604800, report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"",
            "report-to": "{\"endpoints\":[{\"url\":\"https:\/\/a.nel.cloudflare.com\/report?s=[random number]\"}],\"group\":\"cf-nel\",\"max_age\":604800}",
            "nel": "{\"report_to\":\"cf-nel\",\"max_age\":604800}",
            "vary": "Accept-Encoding",
            "server": "cloudflare",
            "cf-ray": "mmm...ray"
        },
        "response": "<html><head></head><body><pre style=\"word-wrap: break-word; white-space: pre-wrap;\">error code: 1020</pre></body></html>",
        "cookies": [
            {
                "name": "__cfduid",
                "value": "[random number]",
                "domain": ".epizod.tv",
                "path": "/",
                "expires": 0,
                "size": 51,
                "httpOnly": true,
                "secure": true,
                "session": false,
                "sameSite": "Lax"
            }
        ],
        "userAgent": "X11; Linux x86_64"
    }
}
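
Until FlareSolverr reports this as an error itself, a client could look past the top-level `"status": "ok"` and check the HTTP status inside `solution`. The following is only a sketch of that idea; `effectiveStatus` is a hypothetical helper name, and the reply object is assumed to have the shape shown in the JSON above.

```javascript
// Hypothetical client-side check: FlareSolverr's own status can be "ok"
// while the target site answered with an HTTP error such as 403.
function effectiveStatus(reply) {
  if (reply.status !== "ok") return "error";
  const httpStatus = reply.solution ? reply.solution.status : undefined;
  // Treat any 4xx/5xx from the target site as a failure of the request.
  if (typeof httpStatus === "number" && httpStatus >= 400) {
    return "http-" + httpStatus;
  }
  return "ok";
}
```

With the reply above, this helper reports `http-403` rather than `ok`.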

ngosang commented 3 years ago

LOG_LEVEL=debug and LOG_HTML=true

I was expecting to see more HTML source code in the traces => #30

We could install and enable by default hcaptcha-solver.

I did a little test but it wasn't able to solve the hCaptchas. I don't know if it was bad luck or the solver is not working. #31

Ah interesting, epizod.tv is blocking any request if the user agent contains: X11; Linux x86_64.

We can change the User-Agent in Jackett just for this site by adding the HTTP header to the request. I don't know if it's possible with the current code but that's the way.

// Exception for epizod.tv
if (response.url().match(/epizod\.tv/gi)

@abeloin I don't like the idea of having exceptions for each site in the code. Is it possible to make it generic to work in other sites too?

abeloin commented 3 years ago

We can change the User-Agent in Jackett just for this site by adding the HTTP header to the request. I don't know if it's possible with the current code but that's the way.

As far as I can tell from debugging YGGCookies, the user-agent string is cosmetic (in Cardigann) and not used at all; it uses the one in BrowserUtil.cs. Some C# indexers, for example abnormal, use a custom user agent that works.

I did a little test but it wasn't able to solve the hCaptchas. I don't know if it was bad luck or the solver is not working. #31

It isn't currently working; it is outputting nonsense words. There is a patch to fix it, but it is not merged yet: https://github.com/JimmyLaurent/hcaptcha-solver, pull request 12.

I'll check if it is working.

Edited

With the patch, I get:

statusCode: 403,
  error: {
    success: false,
    'error-codes': [ 'invalid-answers', 'invalid-motionData' ]
  },

This suggests that the patch alone is not enough anymore.

I don't like the idea of having exceptions for each site in the code. Is it possible to make it generic to work in other sites too?

I didn't like it either, but I wanted to be safe in case it matched the words when they are miles apart in the page, like: error code: this is a long line 1020

Today, I realized that I can just limit it by using {}: error(\s)?code[^0-9]{0,5}?1020
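
A quick sketch of the difference between the two patterns, assuming the page HTML is available as a string. The helper names are made up for illustration, and the `g` flag is dropped here because `RegExp.prototype.test` keeps state between calls when `g` is set.

```javascript
// Original pattern: any number of non-digits may sit between "code" and "1020".
const UNBOUNDED_1020 = /error(\s)?code[^0-9]+1020/i;
// Bounded pattern: at most 5 non-digit characters are allowed in between.
const BOUNDED_1020 = /error(\s)?code[^0-9]{0,5}?1020/i;

// Hypothetical helpers wrapping each pattern.
function matchesUnbounded(html) {
  return UNBOUNDED_1020.test(html);
}

function matchesBounded(html) {
  return BOUNDED_1020.test(html);
}
```

Both match `error code: 1020`, but only the unbounded one matches `error code: this is a long line 1020`.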

ngosang commented 3 years ago

As far as I can tell from debugging YGGCookies, the user-agent string is cosmetic (in Cardigann) and not used at all; it uses the one in BrowserUtil.cs. Some C# indexers, for example abnormal, use a custom user agent that works.

I will take a look and fix Jackett if required.

I didn't like it either, but I wanted to be safe in case it matched the words when they are miles apart in the page, like: error code: this is a long line 1020

It's safer to check the HTTP response code than the response text. I'm getting 403 Forbidden with the Linux user agent on epizod.tv. If we can't find the Cloudflare selectors and the response code is 403, I think it's safe to say the user is blocked or banned.
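
A rough sketch of that generic rule, not the actual FlareSolverr code. The selector list is taken from the debug log earlier in this thread; `isBlocked` and the `hasSelector` callback are hypothetical names, with the callback standing in for whatever lookup the headless browser page provides.

```javascript
// Challenge selectors that appear in the FlareSolverr debug log above.
const CHALLENGE_SELECTORS = ["#trk_jschal_js", ".ray_id", ".attack-box"];

// `hasSelector` is a hypothetical callback: (selector) => boolean.
// The request counts as blocked when the site answers 403 and none of the
// known Cloudflare challenge elements are present in the page.
function isBlocked(statusCode, hasSelector) {
  const challengePresent = CHALLENGE_SELECTORS.some((s) => hasSelector(s));
  return statusCode === 403 && !challengePresent;
}
```

Because it doesn't look at the URL or the page text, the same check would apply to any site, not only epizod.tv.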

ngosang commented 3 years ago

@ilike2burnthing I think this can be closed.