urllib3 versions are conflicting

benoit74 commented 8 months ago

In the canadian_prepper recipe, @RavanJAltaie, @Popolechien and I just saw that we keep getting an error about SSL ciphers.

See for instance the last task, where everything else is supposed to be set up properly.

STDERR:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
botocore 1.29.5 requires urllib3<1.27,>=1.25.4, but you have urllib3 2.1.0 which is incompatible.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 22.0.4; however, version 23.3.1 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
Traceback (most recent call last):
  File "/usr/local/bin/youtube2zim", line 33, in <module>
    sys.exit(load_entry_point('youtube2zim==2.1.18', 'console_scripts', 'youtube2zim')())
  File "/usr/local/lib/python3.8/site-packages/youtube2zim-2.1.18-py3.8.egg/youtube2zim/__main__.py", line 13, in main
    from youtube2zim.entrypoint import main as entry
  File "/usr/local/lib/python3.8/site-packages/youtube2zim-2.1.18-py3.8.egg/youtube2zim/entrypoint.py", line 10, in <module>
    from .scraper import Youtube2Zim
  File "/usr/local/lib/python3.8/site-packages/youtube2zim-2.1.18-py3.8.egg/youtube2zim/scraper.py", line 28, in <module>
    from kiwixstorage import KiwixStorage
  File "/usr/local/lib/python3.8/site-packages/kiwixstorage-0.8.3-py3.8.egg/kiwixstorage/__init__.py", line 27, in <module>
    import boto3
  File "/usr/local/lib/python3.8/site-packages/boto3-1.26.5-py3.8.egg/boto3/__init__.py", line 17, in <module>
    from boto3.session import Session
  File "/usr/local/lib/python3.8/site-packages/boto3-1.26.5-py3.8.egg/boto3/session.py", line 17, in <module>
    import botocore.session
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/session.py", line 26, in <module>
    import botocore.client
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/client.py", line 15, in <module>
    from botocore import waiter, xform_name
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/waiter.py", line 18, in <module>
    from botocore.docs.docstring import WaiterDocstring
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/docs/__init__.py", line 15, in <module>
    from botocore.docs.service import ServiceDocumenter
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/docs/service.py", line 14, in <module>
    from botocore.docs.client import ClientDocumenter, ClientExceptionsDocumenter
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/docs/client.py", line 14, in <module>
    from botocore.docs.example import ResponseExampleDocumenter
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/docs/example.py", line 13, in <module>
    from botocore.docs.shape import ShapeDocumenter
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/docs/shape.py", line 19, in <module>
    from botocore.utils import is_json_value_header
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/utils.py", line 35, in <module>
    import botocore.httpsession
  File "/usr/local/lib/python3.8/site-packages/botocore-1.29.5-py3.8.egg/botocore/httpsession.py", line 22, in <module>
    from urllib3.util.ssl_ import (
ImportError: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_' (/usr/local/lib/python3.8/site-packages/urllib3/util/ssl_.py)

This looks like an incompatibility between urllib3 versions. I have not investigated any further yet, but I'm afraid it is linked to the continuous update of the yt-dlp library, which might have pulled in a newer urllib3 (nothing else has been updated in this scraper for months), and it might hence impact all youtube recipes.
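To make the conflict concrete, here is a minimal sketch that checks a urllib3 version against the range quoted in the pip error above (`urllib3<1.27,>=1.25.4`). The helper names (`parse_version`, `in_botocore_range`, `installed_urllib3_ok`) are hypothetical and are not part of the scraper, botocore or urllib3. It only handles plain numeric releases and is not a substitute for pip's own resolver.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '1.26.18' into (1, 26, 18), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"not a numeric version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def in_botocore_range(urllib3_version):
    """Check the range from the pip error: urllib3>=1.25.4,<1.27."""
    version = parse_version(urllib3_version)
    return parse_version("1.25.4") <= version < parse_version("1.27")


def installed_urllib3_ok():
    """Return True/False for the installed urllib3, or None if it is absent."""
    try:
        return in_botocore_range(metadata.version("urllib3"))
    except metadata.PackageNotFoundError:
        return None
```

With the versions from the log, `in_botocore_range("2.1.0")` is `False`, which matches the conflict pip reported for botocore 1.29.5.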

benoit74 commented 8 months ago

Bingo, the urllib3 dependency was added to yt-dlp only 4 days ago: https://github.com/yt-dlp/yt-dlp/blame/master/requirements.txt

Luckily it seems they still support urllib3 1.26, so it is probably only an issue in the way we install / upgrade yt-dlp.
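One possible way to avoid this kind of drift, sketched under assumptions, is to pass the urllib3 range that botocore needs in the same `pip install` call that upgrades yt-dlp, so the resolver sees both requirements at once. This is not necessarily what the scraper's development version does; the function name `build_upgrade_command` is hypothetical, and the range is the one copied from the pip error above.

```python
import sys

# Range copied from the pip error for botocore 1.29.5 (assumption: still valid
# for whichever botocore version is actually installed).
URLLIB3_RANGE = "urllib3>=1.25.4,<1.27"


def build_upgrade_command(package="yt-dlp", constraint=URLLIB3_RANGE):
    """Build a pip command that upgrades `package` while keeping urllib3 in range."""
    return [
        sys.executable, "-m", "pip", "install",
        "--upgrade", package,
        constraint,  # given as a requirement so pip resolves both together
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) to execute it for real.
    print(" ".join(build_upgrade_command()))
```

The command is built as a list rather than a shell string so that the `<` and `>` characters in the version range need no quoting.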

benoit74 commented 8 months ago

The issue is in fact already solved in the development version. Unfortunately, other issues are preventing this development version from being tested / released; I'm working on them in #185.

benoit74 commented 8 months ago

I just restarted the canadian_prepper recipe with the (fixed) development version. I will monitor it and, if it succeeds, I will release a 2.2.0 version of the scraper with the current development status (and postpone pending issues to 2.3.0).

benoit74 commented 8 months ago

I cancelled the task because it felt like it was not making any progress, but it was in fact just encoding the video (which obviously takes time). The new task is behaving correctly, so I will close this issue (nothing had to be done aside from releasing the new codebase) and release 2.2.0 with the currently fixed issues.

rgaudin commented 8 months ago

Just to be clear (not sure I understand the process here), you should test your changes locally on your machine. Zimfarm should not be used to test scraper code.

benoit74 commented 8 months ago

I first ran the scraper on my machine and it started successfully. Since a Youtube scrape is a long-running task and the change was minimal (just a fix for a minor bug at scraper startup plus a rollback to Python 3.10), I assumed it was ok and that we could merge the change.

However, since the changes in the main branch had been pending for a long time, the scraper was not working on the Zimfarm anymore anyway, and I wanted to make a release, I decided to start the dev image in one recipe to let it run a bit longer than on my machine and be sure it was ok to release.

If the change had been more significant, or if the scraper had been working on the Zimfarm, or if it had not yet been the moment to make a release, I wouldn't have tested this on the Zimfarm; I would have selected a small Youtube playlist and run it to completion locally.

I'm ok that "in general", we shouldn't test on the Zimfarm, but I consider there are always good reasons to make exceptions.

Is this process ok for you or not? I don't really mind if this is not ok for you and you do not want any test to be done on the Zimfarm; I can live with it. It does feel a bit weird to me, though, since somehow this is exactly what the content team is doing all day long when they set up recipes with a trial and error process.

benoit74 commented 8 months ago

@rgaudin @kelson42 is my answer ok for you or should I change my way of working? (without further guidance, I will continue to proceed "as-is")

rgaudin commented 8 months ago

> I'm ok that "in general", we shouldn't test on the Zimfarm, but I consider there are always good reasons to make exceptions.

My POV as well.

> it feels a bit weird to me since somehow this is exactly what the content team is doing all day long when they are setting up recipes with a trial and error process

They are not testing the software. They are using released scrapers that are supposed to work, and if one doesn't (and it's not a recipe error), they should open a ticket on the scraper's repo. zimit is different in that no one is able to predict the output of a particular scrape. There's no better (accessible) way to do it at the moment, so zimit is an exception.

Also, expectations towards content team and developers are different 😉

openzim / youtube

urllib3 versions are conflicting #184