jamesnicolas / yomichan-forvo-server

An audio server for yomichan that scrapes forvo for audio files

CloudFlare verification blocking requests #13

Open jamesnicolas opened 1 year ago

jamesnicolas commented 1 year ago

It looks like Cloudflare is blocking requests to Forvo. Potential workarounds:

Any suggestions or help would be appreciated.

Update: It seems Forvo turned off Cloudflare's "Under Attack" mode. If this happens again, let me know; in the meantime you can try the fix in the cf-cloudscraper branch.

wolfv506 commented 1 year ago

Hey, I came up with a kinda hacky solution: calling a script that uses cloudscraper from the addon's __init__.py in Anki (since I was not able to add cloudscraper as a dependency directly in Anki).
Attached is the external script (you have to change .txt to .py): forvo_fix.txt

First, add the following import at the beginning of __init__.py:

    from subprocess import Popen, PIPE

Inside the addon's __init__.py, replace lines 85-120 in the function "word" with the following, specifying the path of the external script:

    if self.config.show_gender:
        gender = "y"
    else:
        gender = "n"

    external_process = Popen(["python", PATH_TO_EXTERNAL_SCRIPT, "-w", w, "-l", self.config.language, "-g", gender], stdout=PIPE, stderr=PIPE)

    output, err = external_process.communicate()

    pronunciations = json.loads(output.decode().replace("'", '"'))

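One fragile spot in the snippet above is the quote replacement, which breaks as soon as a value contains an apostrophe. Here is a minimal sketch of a sturdier variant. It assumes the external script prints either JSON or a Python literal; the real output format of forvo_fix is not shown in this thread, so that part is a guess. The helper names `parse_pronunciations` and `fetch_pronunciations` are hypothetical, and the `-w`/`-l`/`-g` flags are taken from the call above.

```python
import ast
import json
import subprocess


def parse_pronunciations(raw: bytes):
    """Turn the external script's stdout into Python objects."""
    text = raw.decode("utf-8").strip()
    try:
        # Preferred case: the script printed real JSON.
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: the script printed a Python repr (single-quoted strings).
        # literal_eval only accepts literals, so it does not execute code.
        return ast.literal_eval(text)


def fetch_pronunciations(script_path, word, language, show_gender,
                         python_exe="python", timeout=30):
    """Run the external cloudscraper script and return its parsed output."""
    cmd = [
        python_exe, script_path,
        "-w", word,
        "-l", language,
        "-g", "y" if show_gender else "n",
    ]
    result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    if result.returncode != 0:
        # Surface the script's error text instead of failing later while parsing.
        raise RuntimeError(result.stderr.decode("utf-8", errors="replace"))
    return parse_pronunciations(result.stdout)
```

The interpreter defaults to `"python"` as in the original call, since the Python bundled with Anki may not have cloudscraper installed; pass a full interpreter path if `python` is not on the PATH.
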
jamesnicolas commented 1 year ago

Hey, thanks for the suggestion! I was actually able to package cloudscraper with the addon, but Cloudflare then asked for a captcha, maybe because it saw too much testing from my IP. I could publish a new version with cloudscraper packaged into the addon, but if Cloudflare decides an IP is hitting its verification too often, it might block that user with a captcha. I'm trying to find a more robust solution.

wolfv506 commented 1 year ago

Ah okay, got it. I just tested a bit and requested the Forvo audio for about 100 words in a row with no problems. How many requests did you make? Maybe another solution would be to use rotating proxies or something similar to avoid the captcha block?

jamesnicolas commented 1 year ago

Hmm okay, I guess it depends on the person. It seems the cloudscraper solution might just work better for some people, so I'll deploy it later today, at least as a mitigation.

jamesnicolas commented 1 year ago

Oh actually it looks like the addon is working again. It's hard to test if cloudscraper is actually working or not now. If this happens again, I have the cloudscraper fix in a branch called cf-cloudscraper.
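
To tell whether a given response was a Cloudflare challenge rather than real Forvo content, a rough heuristic like the one below might help. It is only a sketch: the function name `looks_like_cloudflare_challenge` is hypothetical, the body markers are guesses about what challenge pages commonly contain, and Cloudflare can change any of this. The `cf-mitigated: challenge` header is the signal Cloudflare documents for challenge responses; the rest is a fallback.

```python
def looks_like_cloudflare_challenge(status_code, headers, body):
    """Heuristically decide whether an HTTP response is a Cloudflare challenge page."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}

    # Documented signal: Cloudflare marks challenge responses with this header.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Fallback (assumption): a 403/503 served by Cloudflare whose body
    # contains text typically seen on challenge pages.
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    markers = ("just a moment", "cf-chl", "challenge-platform")
    body_lower = body.lower()
    has_marker = any(marker in body_lower for marker in markers)
    return served_by_cloudflare and status_code in (403, 503) and has_marker
```

With a `requests` or cloudscraper response `r`, it could be called as `looks_like_cloudflare_challenge(r.status_code, r.headers, r.text)`, for example to log how often the challenge shows up with and without cloudscraper.
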

StefanVukovic99 commented 1 month ago

This seems to be happening for me right now, and the cloudscraper branch doesn't seem to help

edit: other users also reporting it

edit2: seems they turned it off, working now