morpheus65535 / bazarr

Bazarr is a companion application to Sonarr and Radarr. It manages and downloads subtitles based on your requirements. You define your preferences by TV show or movie and Bazarr takes care of everything for you.
https://www.bazarr.media
GNU General Public License v3.0

Subscene throttling. AttributeError. Exception info: "'NoneType' object has no attribute 'group'" #1331

Closed Ash-Raimon closed 6 months ago

Ash-Raimon commented 3 years ago

Describe the bug: When searching for subtitles for any movie or TV show, Subscene always throttles and gives the following error: Throttling subscene for 10 minutes because of: AttributeError. Exception info: "'NoneType' object has no attribute 'group'"

To Reproduce Steps to reproduce the behavior:

  1. Go to any media
  2. auto search for subtitles
  3. When subscene is triggered for search, it stops (throttles).
  4. Displays the error in the log

I've attached the bazarr.log debug log. The relevant part starts at line 70; lines 101 and 102 are where the Subscene throttling error shows.

Software (please complete the following information):

Additional context: The captcha service used is Death by Captcha.

Bazarr debug log: bazarr_2.log

morpheus65535 commented 3 years ago

Sorry but I can't reproduce. I've just tested multiple searches with Subscene by using DBC but it worked fine every time. Maybe you can try to empty your config/cache directory and try again.

Ash-Raimon commented 3 years ago

My cache directory is /home/ashus/bazarr/data/cache, while my config directory is /home/ashus/bazarr/data/config, which contains config.ini (all my settings), releases.txt and throttled_providers.dat.

I wasn't sure if you meant the cache directory only or both config and cache. However, I went ahead and emptied my cache directory.

bazarr (3).log

The attached log was emptied beforehand and has debug enabled.

Subscene error starts showing on line 125.

What's interesting is that this line includes a captcha-related part, repeated several times in different sections: raise ValueError('Captcha') ValueError: Captcha During handling of the above exception

Also, viewing the log in Notepad++ on Windows, line 125, column 2261 reads: site_key = re.search(r'data-sitekey="(.+?)"', resp.text).group(1) AttributeError: 'NoneType' object has no attribute 'group' | Traceback (most recent call last): File "/home/ashus/bazarr/bazarr/../libs/subliminal_patch/http.py", line 95, in _request

This is followed by a call to the isChallengeRequest function, which leads to raise ValueError('Captcha') ValueError:.
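For reference, the AttributeError in that traceback is what Python raises when re.search finds no match and returns None, and .group(1) is then called on that None. Below is a minimal sketch of a guarded version of the same lookup. The helper name extract_site_key is hypothetical and this is not the actual subliminal_patch code; only the regular expression is taken from the traceback.

```python
import re
from typing import Optional

# Same pattern as in the traceback: captures the value of data-sitekey="...".
SITE_KEY_RE = re.compile(r'data-sitekey="(.+?)"')


def extract_site_key(html: str) -> Optional[str]:
    """Return the captcha site key found in the page, or None if there is none.

    Hypothetical helper for illustration only. re.search returns None when
    the pattern does not match, so the match object is checked before
    calling .group(1).
    """
    match = SITE_KEY_RE.search(html)
    if match is None:
        return None
    return match.group(1)
```

With a check like this, a page that carries no data-sitekey attribute (for example a block page) yields None that the caller can handle, instead of an AttributeError.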

I can also try emptying my config directory; I just wanted to make sure that's the right directory to empty.

morpheus65535 commented 3 years ago

No, you were right, I was only talking about the cache directory. Is there remaining credit on your DBC account? It looks like the DBC API is returning a different answer for you...

Ash-Raimon commented 3 years ago

> No, you were right, I was only talking about the cache directory. Is there remaining credit on your DBC account? It looks like the DBC API is returning a different answer for you...

Yes, I just checked and there's still $4.86 on it, which is roughly 2800 captchas left on my account. I may need to look into Anti-Captcha if it doesn't work, but I wanted to be sure the fault was on DBC first.

If DBC is working on your end, then I may have something stuck or an old cache. I know the provider addic7ed does not work with DBC, as it produced a clear error log about the captcha not being solved. But Subscene got me intrigued as it showed an AttributeError.

Is there something else I could try? Or maybe try a full reset and start over?

I did try adding bazarr4K for my 4K content, different from normal bazarr, and it showed the exact same error.

EDIT: The error log for addic7ed was showing something along the lines of: Exception info: "Addic7ed: Couldn't solve captcha!" then goes into: OSError. Exception info: 'Connection lost or timed out during API request'

Apologies, I couldn't get the logs as I'm still trying to reproduce the debug log for it. Addic7ed isn't related here; what I wanted to show is that addic7ed reported its error clearly when DBC's captchas weren't solved or took longer to solve. Subscene shows a different error, but I don't know whether it's also related to DBC taking longer to solve.

Ash-Raimon commented 3 years ago

As an update, I've switched the captcha service to Anti-Captcha and retried the Subscene search. This is the debug log (it's short and filtered): bazarr.log

Line 76 is where the error occurs, and it shows the same error as before.

Line 73 shows it creating a session and trying to log in, but line 75 shows the following just before the error: https://subscene.com:443 "GET /account/login HTTP/1.1" 403 None|

So it's refusing the request.

morpheus65535 commented 3 years ago

Your IP is probably blocked (temporarily I hope) by cloudflare. Wait a couple of days before trying again.

Ash-Raimon commented 3 years ago

Will do then, that's probably it. I think I'm having bad luck with captchas lol. Anti-Captcha has a dashboard that shows how long it takes to resolve a captcha. It took 80 seconds, but led to this pop-up saying:

DataTables warning: table id=search_result - Invalid JSON response. For more information about this error, please see http://datatables.net/tn/1

Probably unrelated, thought it might be useful to include this, maybe DBC and AC are slowing down today.

In any case, I will test it again next weekend.

morpheus65535 commented 3 years ago

I haven't seen this datatables error in a while. Since 0.9.4 we don't use datatables anymore so you shouldn't run into this anymore. Let's wait for a week and come back with the result of your test.

arabcoders commented 3 years ago

I can confirm the error is caused, at least for me, by Cloudflare: since I am running my *arr apps in the cloud, they treat the IP differently than my home IP.

Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?

Ash-Raimon commented 3 years ago

> I can confirm the error is caused, at least for me, by Cloudflare: since I am running my *arr apps in the cloud, they treat the IP differently than my home IP.
>
> Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?

I am using RP on a remote server but not using cloudflare though.

That error is showing up whenever my anti captcha is taking longer than expected (80 seconds long).

Will wait till next week to retry.

arabcoders commented 3 years ago

>> I can confirm the error is caused, at least for me, by Cloudflare: since I am running my *arr apps in the cloud, they treat the IP differently than my home IP. Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?
>
> I am using RP on a remote server but not using cloudflare though.
>
> That error is showing up whenever my anti captcha is taking longer than expected (80 seconds long).
>
> Will wait till next week to retry.

I mean Subscene is behind Cloudflare, which sometimes blocks IPs. Could you try running curl https://subscene.com/ from a shell and see the response?

morpheus65535 commented 3 years ago

> Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?

That's exactly what happens. For some reason it fails, and the IP gets temporarily blocked.

arabcoders commented 3 years ago

>> Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?
>
> That's exactly what happens. For some reason it fails, and the IP gets temporarily blocked.

Yeah, I am not sure why, but on my side the anti-captcha is not even running or using credits; it just fails directly with the exact same message. I temporarily solved it by proxying bazarr requests via my VPN container, and it's working fine for now, but I'm unsure when it's going to fail again.

morpheus65535 commented 3 years ago

And what is the result of a curl to subscene without the vpn?

arabcoders commented 3 years ago

the usual cloudflare captcha page

excerpt

      <div class="cf-section cf-wrapper">
        <div class="cf-columns two">
          <div class="cf-column">
            <h2 data-translate="why_captcha_headline">Why do I have to complete a CAPTCHA?</h2>

            <p data-translate="why_captcha_detail">Completing the CAPTCHA proves you are a human and gives you temporary access to the web property.</p>
          </div>

          <div class="cf-column">
            <h2 data-translate="resolve_captcha_headline">What can I do to prevent this in the future?</h2>

            <p data-translate="resolve_captcha_antivirus">If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p>

            <p data-translate="resolve_captcha_network">If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.</p>

          </div>
        </div>
      </div>
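
A rough sketch of how a response like the excerpt above could be told apart from a normal page or a plain HTTP error. The function name describe_response and its return labels are hypothetical, and the markers are simply strings copied from this excerpt; Cloudflare can serve other page variants, so this is an assumption-laden illustration and not how Bazarr detects challenges.

```python
# Strings taken from the excerpt above; other Cloudflare pages may not contain them.
CAPTCHA_MARKERS = (
    'data-translate="why_captcha_headline"',
    'class="cf-section cf-wrapper"',
)


def describe_response(status_code: int, body: str) -> str:
    """Classify a fetched page (hypothetical helper, illustration only).

    Returns "cloudflare-captcha" if the body contains one of the markers,
    "http-error" for any other response with a 4xx/5xx status code,
    and "ok" otherwise.
    """
    if any(marker in body for marker in CAPTCHA_MARKERS):
        return "cloudflare-captcha"
    if status_code >= 400:
        return "http-error"
    return "ok"
```

Feeding it the status code and body of a saved curl response gives a quick way to compare what a home connection and a hosted server receive.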

Ash-Raimon commented 3 years ago

As an update, I just re-enabled the subscene provider using Anti-Captcha.

bazarr.log

I apologize for not removing the other providers (opensubtitles.org/.com and podnapisi).

However, subscene starts on line 169, and then follows up to the same AttributeError output as before.

EDIT:

I've also tried Addic7ed with Anti-Captcha. It doesn't seem to work, but I can't figure out what is happening (it's not the same error as Subscene; it just doesn't respond).

This is used with "8 Mile" Movie: bazarr (1).log

this is used with "The Amazing Spider-man" movie: bazarr (2).log

Ash-Raimon commented 3 years ago

Re-did the test using DBC instead. I used a single provider per test and filtered out everything else. This should give a clearer debug result.

Addic7ed with DBC: bazarr (3).log

Subscene with DBC: bazarr (4).log

In short, whether I used DBC or Anti-Captcha, the end result is exactly the same (i think). Don't know what is causing this.

Ash-Raimon commented 3 years ago

>>> I can confirm the error is caused, at least for me, by Cloudflare: since I am running my *arr apps in the cloud, they treat the IP differently than my home IP. Is there any way to use the Anti-Captcha account to solve the Cloudflare challenge?
>>
>> I am using RP on a remote server but not using cloudflare though. That error is showing up whenever my anti captcha is taking longer than expected (80 seconds long). Will wait till next week to retry.
>
> I mean Subscene is behind Cloudflare, which sometimes blocks IPs. Could you try running curl https://subscene.com/ from a shell and see the response?

> And what is the result of a curl to subscene without the vpn?

I've done what @ArabCoders suggested and ran curl both from home and from the server. For Subscene, the server side is indeed showing the Cloudflare captcha, exactly as he said.

cURL from home shows OK: cURL Home - Subscene.txt

cURL from server side (shows the cloudflare message): cURL Server - Subscene.txt

Change their extension from .txt to .html to view the full page exactly as it is served, or open them in Notepad++ with the language set to HTML. I couldn't upload files with an .html extension here, so I changed it to .txt (for some reason, pastebin wasn't working for me).

For testing purposes, I also did a cURL for addic7ed, it works, both from server side and home. But for unknown reason it does not work in bazarr to fetch subtitles (debug error code uploaded previously).

morpheus65535 commented 3 years ago

For some reason, cloudflare is responding differently to you than to me. I can't reproduce your issue... it's kind of hard to fix something that isn't broken for me... You're sure you don't have anything that could be messing up your network call? PiHole or something?

Ash-Raimon commented 3 years ago

> For some reason, cloudflare is responding differently to you than to me. I can't reproduce your issue... it's kind of hard to fix something that isn't broken for me... You're sure you don't have anything that could be messing up your network call? PiHole or something?

Don't have anything that blocks it as far as I remember. The only programs installed on my server (all manually installed, no docker, no VM etc..):

I also got a second bazarr for 4K movies.

no UFW, using hetzner's own firewall robot site, and all ports are open.

I have very limited knowledge in this area, but shouldn't DBC/Anti-Captcha solve the captchas that Cloudflare puts in place? I'm more confused about why running curl from home (personal PC) works without using any anti-captcha, while on the server side it's blocked by the Cloudflare captcha.

There's something I'm missing or some config I need to do.

EDIT: using hetzner's servers.

morpheus65535 commented 3 years ago

Ok so that's not uncommon at all! Those "dedicated" or seedbox providers often get IP blocked by Cloudflare and get immediately flagged by anti-bot protection. It could be simply that here.

Ash-Raimon commented 3 years ago

Maybe that's why.

Although subscene did work with me for a long time since the old bazarr UI (2019 i think?)

I can't pinpoint when exactly it stopped working; the last time I checked was perhaps last December.

I may have to find a way around this. I might order a new IP and test it again.

morpheus65535 commented 3 years ago

The best way to confirm this would be to deploy the same setup (at least a minimal setup) at home so you can test a search from your home IP. It won't fix your issue on your hosted server but at least it would help confirm that's a blocked/tagged IP issue.

Ash-Raimon commented 3 years ago

Will try that out and see how it goes. Thanks

andrsm commented 3 years ago

Hi. I just want to add that I have the exact same problem. I get the same error in bazarr and same return from curl https://subscene.com/. My problems also started some months ago.

Ash-Raimon commented 3 years ago

Sorry for the late reply, but I can now confirm this works on my local computer, which is not a surprise as it was suspected that the dedicated server IPs might be blacklisted on Subscene's Cloudflare.

But the error page shows that it's not being banned/blocked; it's more like it's not passing Cloudflare's own captcha test.

In the meantime, I believe a VPN is a temporary fix for dedicated servers hosted remotely.

> Hi. I just want to add that I have the exact same problem. I get the same error in bazarr and same return from curl https://subscene.com/. My problems also started some months ago.

Just to double check here, is your server also hosted remotely? i.e. hetzner, OVH etc.. ? Or are you using bazarr locally on your own computer?

morpheus65535 commented 3 years ago

> But the error page shows that it's not being banned/blocked; it's more like it's not passing Cloudflare's own captcha test.

From time to time, their implementation of Cloudflare anti-bot seems to return a different page that of course can't be parsed the same way as the "normal" page. I've been able to reproduce it only once so it's kind of hard to debug.

andrsm commented 3 years ago

> Just to double check here, is your server also hosted remotely? i.e. hetzner, OVH etc.. ? Or are you using bazarr locally on your own computer?

Yes, I have it remotely hosted.

morpheus65535 commented 3 years ago

This one should be fixed in 0.9.6-beta.32. @Ash-Raimon Can you test it and confirm if the issue is resolved?

Ash-Raimon commented 3 years ago

> This one should be fixed in 0.9.6-beta.32. @Ash-Raimon Can you test it and confirm if the issue is resolved?

I'm on development branch version 0.9.6-beta.32

Here's the log using bazarr with only Subscene alongside Death By Captcha (also tested with Anti-Captcha; both give the same result):

bazarr (1).log

Lines 49 and 50 show the following error: Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.

EDIT: Browsing directly to Subscene.com and trying to login gives a server error. It may or may not be related but wanted to see if I can regenerate cookies.

Ash-Raimon commented 3 years ago

I think my bazarr got overwritten.

I've changed the branch through webUI from master to development, and ran the task to update bazarr, which initially showed me version 0.9.6.

I then went into my Ubuntu 20.04 headless server and ran git branch just to double check, which showed I was on the master branch. I ran git status and got a lot of unmerged files.

I tried git pull, but ran into errors and bazarr stopped working (giving error 500 on the web UI). In the end I tried git fetch, and I think my bazarr got overwritten: the web UI now shows I'm back on the master branch on version 0.9.2 (which is older), but all series/movies are gone, along with providers/subtitles and my settings.

I'm not really concerned as I can just redo them. But does bazarr have a backup database that I can import from? I do have a bazarr4k instance running whose db I could perhaps use.

EDIT: I don't think I can revert to an old database within bazarr as it has all been erased. However, my bazarr4k folder used to be my backup of the original bazarr every time I wanted to manually update, before I made it exclusively for 4K sonarr/radarr. Do I just take all of its content folders and dump them into bazarr? FIXED.

EDIT2: problem fixed with bazarr db. I don't know much about github, but I couldn't see a development branch within git's CLI, only "master" and "status".

morpheus65535 commented 3 years ago

We don't use git anymore to update Bazarr. You won't get the required assets using it. Bazarr will download the required assets from Github releases.

morpheus65535 commented 3 years ago

I didn't run into this non-free version during my test. I'll try to trigger it.

Ash-Raimon commented 3 years ago

> We don't use git anymore to update Bazarr. You won't get the required assets using it. Bazarr will download the required assets from Github releases.

So I can use the web UI to change from the master to the development branch. But is there a way to revert to the master version if I want to?

> I didn't run into this non-free version during my test. I'll try to trigger it.

Subscene had login issues last time; I'll try to re-test it.

andrsm commented 3 years ago

I got the same error "Throttling subscene for 10 minutes, until 21/06/26 01:03, because of: CloudflareChallengeError. Exception info: 'Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.'"

Bazarr Version: 0.9.6-beta.32
Sonarr Version: 3.0.6.1265
Radarr Version: 3.2.2.5080
Operating System: Linux-5.4.0-74-generic-x86_64-with-glibc2.29
Python Version: 3.8.5

morpheus65535 commented 3 years ago

@Ash-Raimon With upcoming 0.9.6-beta.33 you'll be able to move back to master from the UI.

Ash-Raimon commented 3 years ago

> @Ash-Raimon With upcoming 0.9.6-beta.33 you'll be able to move back to master from the UI.

Awesome thanks!

thezoggy commented 6 months ago

can prob close this out as subscene is no more, site closed...