TheAlgorithms / Python

All Algorithms implemented in Python
https://the-algorithms.com/
MIT License

403 HTTP errors in `web_programming/fetch_anime_and_play.py` #8762

Closed: tianyizheng02 closed this issue 1 year ago

tianyizheng02 commented 1 year ago

Repository commit

60b5ead62a32bc73e1a06bca57305d4a0f14e269

Python version (python --version)

Python 3.11.3

Dependencies version (pip freeze)

beautifulsoup4==4.12.2
black==22.12.0
certifi==2022.12.7
cfgv==3.3.1
charset-normalizer==2.1.1
click==8.1.3
commonmark==0.9.1
contourpy==1.0.6
cycler==0.11.0
distlib==0.3.6
fake-useragent==1.1.3
filelock==3.9.0
fonttools==4.38.0
identify==2.5.11
idna==3.4
joblib==1.2.0
kiwisolver==1.4.4
matplotlib==3.6.2
mpmath==1.2.1
mypy==0.991
mypy-extensions==0.4.3
nodeenv==1.7.0
numpy==1.24.1
packaging==22.0
pandas==1.5.2
pathspec==0.10.3
Pillow==9.4.0
pip==23.1.2
platformdirs==2.6.2
pre-commit==2.21.0
Pygments==2.13.0
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2022.7
PyYAML==6.0
requests==2.28.1
rich==12.6.0
ruff==0.0.260
scikit-learn==1.2.0
scipy==1.9.3
seaborn==0.12.2
setuptools==65.6.3
six==1.16.0
soupsieve==2.4.1
sympy==1.11.1
threadpoolctl==3.1.0
types-attrs==19.1.0
types-requests==2.28.11.7
types-urllib3==1.26.25.4
typing_extensions==4.4.0
urllib3==1.26.13
virtualenv==20.17.1

Expected behavior

Doctests in web_programming/fetch_anime_and_play.py should all pass

Actual behavior

They all fail:

$ python3 -m doctest -v web_programming/fetch_anime_and_play.py
Trying:
    type(get_anime_episode("/watch/kimetsu-no-yaiba/1"))
Expecting:
    <class 'list'>
**********************************************************************
File ".../web_programming/fetch_anime_and_play.py", line 121, in fetch_anime_and_play.get_anime_episode
Failed example:
    type(get_anime_episode("/watch/kimetsu-no-yaiba/1"))
Exception raised:
    Traceback (most recent call last):
      File ".../3.11/lib/python3.11/doctest.py", line 1351, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest fetch_anime_and_play.get_anime_episode[0]>", line 1, in <module>
        type(get_anime_episode("/watch/kimetsu-no-yaiba/1"))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../web_programming/fetch_anime_and_play.py", line 139, in get_anime_episode
        response.raise_for_status()
      File ".../.venv/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://ww1.gogoanime2.org/watch/kimetsu-no-yaiba/1
Trying:
    type(search_anime_episode_list("/anime/kimetsu-no-yaiba"))
Expecting:
    <class 'list'>
**********************************************************************
File ".../web_programming/fetch_anime_and_play.py", line 74, in fetch_anime_and_play.search_anime_episode_list
Failed example:
    type(search_anime_episode_list("/anime/kimetsu-no-yaiba"))
Exception raised:
    Traceback (most recent call last):
      File ".../3.11/lib/python3.11/doctest.py", line 1351, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest fetch_anime_and_play.search_anime_episode_list[0]>", line 1, in <module>
        type(search_anime_episode_list("/anime/kimetsu-no-yaiba"))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../web_programming/fetch_anime_and_play.py", line 90, in search_anime_episode_list
        response.raise_for_status()
      File ".../.venv/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://ww1.gogoanime2.org/anime/kimetsu-no-yaiba
Trying:
    type(search_scraper("demon_slayer"))
Expecting:
    <class 'list'>
**********************************************************************
File ".../web_programming/fetch_anime_and_play.py", line 16, in fetch_anime_and_play.search_scraper
Failed example:
    type(search_scraper("demon_slayer"))
Exception raised:
    Traceback (most recent call last):
      File ".../3.11/lib/python3.11/doctest.py", line 1351, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest fetch_anime_and_play.search_scraper[0]>", line 1, in <module>
        type(search_scraper("demon_slayer"))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../web_programming/fetch_anime_and_play.py", line 37, in search_scraper
        response.raise_for_status()
      File ".../.venv/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://ww1.gogoanime2.org/search/demon_slayer
1 items had no tests:
    fetch_anime_and_play
**********************************************************************
3 items had failures:
   1 of   1 in fetch_anime_and_play.get_anime_episode
   1 of   1 in fetch_anime_and_play.search_anime_episode_list
   1 of   1 in fetch_anime_and_play.search_scraper
3 tests in 4 items.
0 passed and 3 failed.
***Test Failed*** 3 failures.

I believe that this started happening in this build run. That build run was for a PR that didn't edit fetch_anime_and_play.py at all, so the errors aren't due to an accidentally introduced bug. Furthermore, every build since that one has also failed for the same reason.

CaedenPH commented 1 year ago

@tianyizheng02 This is due to https://ww1.gogoanime2.org implementing Cloudflare bot protection services.
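
For anyone who wants to check whether a 403 like the ones above comes from bot protection rather than from the script itself, here is a rough heuristic sketch. The function name `looks_like_cloudflare_block` is hypothetical, and the header names it inspects (`Server: cloudflare`, `CF-RAY`, `cf-mitigated: challenge`) are assumptions about what Cloudflare-fronted sites commonly send; they are hints, not proof.

```python
from collections.abc import Mapping


def looks_like_cloudflare_block(status_code: int, headers: Mapping[str, str]) -> bool:
    """Guess whether a response was rejected by Cloudflare bot protection.

    Heuristic only: the headers checked here are assumptions about typical
    Cloudflare responses and may be absent or change over time.
    """
    # Blocks and challenges are usually served as 403 (sometimes 503).
    if status_code not in (403, 503):
        return False

    # Header names are case-insensitive, so normalise before comparing.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}

    served_by_cloudflare = (
        normalized.get("server", "") == "cloudflare" or "cf-ray" in normalized
    )
    challenged = normalized.get("cf-mitigated", "") == "challenge"
    return served_by_cloudflare or challenged


# Example with hand-written headers (not captured from the real site).
print(looks_like_cloudflare_block(403, {"Server": "cloudflare", "CF-RAY": "abc123"}))
print(looks_like_cloudflare_block(403, {"Server": "nginx"}))
```

With `requests`, the same check could be fed `response.status_code` and `response.headers` before calling `response.raise_for_status()`.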

tianyizheng02 commented 1 year ago

> @tianyizheng02 This is due to https://ww1.gogoanime2.org implementing Cloudflare bot protection services.

@CaedenPH I see, then it's definitely best to just mark it as broken like you did. Should we be concerned about this happening to the other scripts in web_programming? I think a good number of them are also web scrapers.
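
One possible way to make scraper doctests less fragile, sketched here as a suggestion and not as what the repository currently does: keep the HTML parsing in a pure function that is doctested against a fixed string, and mark the doctest that needs the network with `# doctest: +SKIP`. The names `LinkCollector`, `extract_links` and `fetch_links` are hypothetical, and the sketch uses only the standard library instead of `requests` and `beautifulsoup4`.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list[str]:
    """
    Pure parsing step: no network access, so this doctest is deterministic.

    >>> extract_links('<a href="/anime/a">A</a><p>x</p><a href="/anime/b">B</a>')
    ['/anime/a', '/anime/b']
    """
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def fetch_links(url: str) -> list[str]:
    """
    Network step: skipped by doctest because the remote site may block bots.

    >>> type(fetch_links("https://example.com/"))  # doctest: +SKIP
    <class 'list'>
    """
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return extract_links(response.read().decode("utf-8", errors="replace"))
```

The trade-off is that a skipped doctest no longer detects a site that has changed its markup, so the parsing tests only stay meaningful while the fixed sample resembles the real pages.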

cclauss commented 1 year ago

Will our dependency https://pypi.org/project/fake-useragent (listed above) allow us to get past Cloudflare?

tianyizheng02 commented 1 year ago

> Will our dependency https://pypi.org/project/fake-useragent (listed above) allow us to get past Cloudflare?

@cclauss I don't think so. This script had already been using the fake_useragent module, so it'd have to be some other method.
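
To illustrate why a spoofed user agent alone may not help, here is a small sketch of what that approach amounts to on the wire: the outgoing request differs from a default one in a single header. It uses the standard library and a hard-coded list in place of `fake_useragent`; `SAMPLE_USER_AGENTS` and `build_request` are hypothetical names, and the strings are illustrative samples only.

```python
import random
from typing import Optional
from urllib.request import Request

# Hypothetical sample strings standing in for what a library such as
# fake_useragent would supply; they are illustrative, not an official list.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/114.0",
]


def build_request(url: str, rng: Optional[random.Random] = None) -> Request:
    """Build (but do not send) a request with a randomly chosen User-Agent."""
    rng = rng or random.Random()
    return Request(url, headers={"User-Agent": rng.choice(SAMPLE_USER_AGENTS)})


request = build_request("https://ww1.gogoanime2.org/search/demon_slayer")
# Only the User-Agent header has been set explicitly on the request object.
print(request.header_items())
```

Nothing is sent here; the request object is only constructed. If the protection relies on signals other than this header (which is an assumption on my part about how such services work), rotating the string would not change the outcome.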

tianyizheng02 commented 1 year ago

Update: I just tried running the file locally again and now it's no longer broken. Not exactly sure what happened on GoGoAnime's end to fix it, but I think we just need to open a PR to re-enable the file now.

tianyizheng02 commented 1 year ago

File was re-enabled in #8988