niveK77pur / gogoanime

Simple terminal downloader for Gogoanime episodes (entire seasons)

Captcha triggered? #10

Open niveK77pur opened 2 years ago

niveK77pur commented 2 years ago

When downloading anime, the script very often fails at the Extracting video links ... step with the error message below. The file written to /tmp/gogoanime.html shows that a button needs to be pressed to verify that I'm not a bot.

Could not extract HTML for the following site. Timeout reached.
https://goload.pro/download?id=MTQ4NTMy&typesub=Gogoanime-SUB&title=Shingeki+no+Kyojin%3A+The+Final+Season+Episode+1
Page written to /tmp/gogoanime.html for debugging.
Traceback (most recent call last):
  File "/home/kuni/Videos/Anime/gogoanime/./gogoanime.py", line 206, in <module>
    soup = getDownloadPageHTML(browser, episode)
  File "/home/kuni/Videos/Anime/gogoanime/./gogoanime.py", line 54, in getDownloadPageHTML
    what_is_this = WebDriverWait(browser, timeout).until(
  File "/home/kuni/.local/lib/python3.10/site-packages/selenium/webdriver/support/wait.py", line 89, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:183:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:395:5
element.find/</<@chrome://remote/content/marionette/element.js:300:16
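To tell a captcha page apart from an ordinary timeout, the debug file could be scanned for tell-tale text. The following is only a rough sketch: the marker strings and the helper names `looks_like_captcha` and `check_debug_page` are assumptions for illustration, not part of gogoanime.py, and the real page may use different wording.

```python
from pathlib import Path

# Hypothetical marker strings; the actual verification page may word this differently.
CAPTCHA_MARKERS = ("captcha", "recaptcha", "not a bot", "not a robot")


def looks_like_captcha(html: str) -> bool:
    """Return True if the HTML contains any of the assumed captcha markers."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def check_debug_page(path: str = "/tmp/gogoanime.html") -> bool:
    """Inspect the debug page written by the script, if it exists."""
    page = Path(path)
    if not page.is_file():
        return False
    html = page.read_text(encoding="utf-8", errors="replace")
    return looks_like_captcha(html)
```

Calling `check_debug_page()` after the timeout would let the script print a hint such as "captcha suspected" instead of only the bare traceback.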

niveK77pur commented 2 years ago

It seems that opening the link that is displayed (just before the line saying Page written ...) and solving the captcha somehow stops the script from triggering the captcha again. Note that I solved the captcha in Firefox, and this script currently uses Firefox through Selenium.

Maybe the cookies are somehow shared between my Firefox and Selenium's Firefox? Either way, it seems cookies need to be reintroduced into the code to avoid the captcha. This might even make it possible to remove the Selenium dependency, which is currently used to render the page. Selenium makes the code a lot slower, so getting rid of it is worth investigating.
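One possible way to test the cookie idea is sketched below. It assumes the Firefox cookies have already been exported to a Netscape-format cookies.txt file (the issue does not say how, so this is an assumption), and that filtering on the goload.pro domain is enough. The helper names `load_cookies_for_selenium` and `apply_cookies` are hypothetical. `add_cookie`, `get` and `refresh` are regular Selenium WebDriver methods, but whether the site accepts the transferred cookies is untested.

```python
from http.cookiejar import MozillaCookieJar


def load_cookies_for_selenium(cookie_file, domain_filter="goload.pro"):
    """Read a Netscape-format cookies.txt and return dicts usable with add_cookie()."""
    jar = MozillaCookieJar(cookie_file)
    # Keep session and expired cookies too, so nothing is silently dropped on load.
    jar.load(ignore_discard=True, ignore_expires=True)

    cookies = []
    for cookie in jar:
        if domain_filter not in cookie.domain:
            continue
        entry = {
            "name": cookie.name,
            "value": cookie.value,
            "domain": cookie.domain,
            "path": cookie.path,
            "secure": cookie.secure,
        }
        if cookie.expires is not None:
            entry["expiry"] = int(cookie.expires)
        cookies.append(entry)
    return cookies


def apply_cookies(browser, cookies, url):
    """Load the site first (cookies can only be set for the current domain), then add them."""
    browser.get(url)
    for cookie in cookies:
        browser.add_cookie(cookie)
    browser.refresh()
```

If this works, the same cookie jar could in principle be handed to a plain HTTP client instead of a browser, which is where dropping Selenium would come in. That only helps if the download page does not need JavaScript rendering.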

EDIT: After solving the captcha, the script no longer triggers it, but for some reason the affected episodes are downloaded as empty files. You will have to remove the 3 lines corresponding to the affected episodes and run the script again.

EDIT 2: This workaround does not seem to work consistently.
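
To find out which episodes ended up as empty files (see the first EDIT), a small helper could list zero-byte files in the download directory. This is a minimal sketch: the function name `find_empty_downloads` and the `*.mp4` pattern are assumptions, since the issue does not state how the script names its output files.

```python
from pathlib import Path


def find_empty_downloads(directory, pattern="*.mp4"):
    """Return a sorted list of zero-byte files matching the pattern in the directory."""
    return sorted(
        path
        for path in Path(directory).glob(pattern)
        if path.is_file() and path.stat().st_size == 0
    )
```

For example, `find_empty_downloads(".")` would report the empty files in the current directory, which are the episodes to retry.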