yaml / pyyaml

Canonical source repository for PyYAML
MIT License
2.56k stars 518 forks source link

PyYAML 6.0 load() function requires Loader argument #576

Open aviasd opened 3 years ago

aviasd commented 3 years ago

When using an updated version of pyyaml (version 6.0) on Google Colab, there is an import problem in some of Google Colab python packages, like plotly.express:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-86e89bd44552> in <module>()
----> 1 import plotly.express as px

9 frames
/usr/local/lib/python3.7/dist-packages/plotly/express/__init__.py in <module>()
     13     )
     14 
---> 15 from ._imshow import imshow
     16 from ._chart_types import (  # noqa: F401
     17     scatter,

/usr/local/lib/python3.7/dist-packages/plotly/express/_imshow.py in <module>()
      9 
     10 try:
---> 11     import xarray
     12 
     13     xarray_imported = True

/usr/local/lib/python3.7/dist-packages/xarray/__init__.py in <module>()
      1 import pkg_resources
      2 
----> 3 from . import testing, tutorial, ufuncs
      4 from .backends.api import (
      5     load_dataarray,

/usr/local/lib/python3.7/dist-packages/xarray/tutorial.py in <module>()
     11 import numpy as np
     12 
---> 13 from .backends.api import open_dataset as _open_dataset
     14 from .backends.rasterio_ import open_rasterio as _open_rasterio
     15 from .core.dataarray import DataArray

/usr/local/lib/python3.7/dist-packages/xarray/backends/__init__.py in <module>()
      4 formats. They should not be used directly, but rather through Dataset objects.
      5 
----> 6 from .cfgrib_ import CfGribDataStore
      7 from .common import AbstractDataStore, BackendArray, BackendEntrypoint
      8 from .file_manager import CachingFileManager, DummyFileManager, FileManager

/usr/local/lib/python3.7/dist-packages/xarray/backends/cfgrib_.py in <module>()
     14     _normalize_path,
     15 )
---> 16 from .locks import SerializableLock, ensure_lock
     17 from .store import StoreBackendEntrypoint
     18 

/usr/local/lib/python3.7/dist-packages/xarray/backends/locks.py in <module>()
     11 
     12 try:
---> 13     from dask.distributed import Lock as DistributedLock
     14 except ImportError:
     15     DistributedLock = None

/usr/local/lib/python3.7/dist-packages/dask/distributed.py in <module>()
      1 # flake8: noqa
      2 try:
----> 3     from distributed import *
      4 except ImportError:
      5     msg = (

/usr/local/lib/python3.7/dist-packages/distributed/__init__.py in <module>()
      1 from __future__ import print_function, division, absolute_import
      2 
----> 3 from . import config
      4 from dask.config import config
      5 from .actor import Actor, ActorFuture

/usr/local/lib/python3.7/dist-packages/distributed/config.py in <module>()
     18 
     19 with open(fn) as f:
---> 20     defaults = yaml.load(f)
     21 
     22 dask.config.update_defaults(defaults)

TypeError: load() missing 1 required positional argument: 'Loader

When reverting back to pyyaml version 5.4.1, the problem is solved.

bnx05 commented 3 years ago

I ran into this issue as well, had to revert back to 5.4.1 to get our tests working again.

support/frames_parser.py:14: in __init__
    file = yaml.load(open(file_path))
E   TypeError: load() missing 1 required positional argument: 'Loader'
sbesson commented 3 years ago

We got hit by the same error which is related to https://github.com/yaml/pyyaml/pull/561 which was included yesterday in the latest major release of PyYAML 6.0. The simple yaml.load(f) call has been deprecated in the 5.x line in favor of yaml.load(f, Loader=loader). Either capping PyYAML to 5.x or updating all usages of yaml.load should restore your builds.

nitzmahone commented 3 years ago

TBC: this is as-designed. Loading without specifying a loader has been deprecated and issuing loud warnings for the past three years (since 5.1), due to numerous CVEs being filed against the default loader's ability to execute arbitrary(ish) code. Since changing the default to a significantly less-capable loader was just as big a breaking change (and one that could cause more problems in the future), it was decided to just require folks to be specific about the capability they required from the loader going forward.

ingydotnet commented 3 years ago

Reopening this for a spell until the dust settles.

ingydotnet commented 3 years ago

Retitled the issue and pinned it.

cirodrig commented 3 years ago

The documentation still presents the old usage of load, which is now invalid, in quite a few places. I opened an issue on the documentation's repository.

andy-maier commented 3 years ago

@cirodrig I did not find your documentation issue, can you post a link?

I think the documentation should point out what the recommended loader is now that users need to specify it. I guess there is no single recommended loader, so the recommendation should cover which loader is recommended for which use case. I don't find that in the documentation today.

cirodrig commented 3 years ago

@andy-maier I created an issue here https://github.com/yaml/pyyaml.org/issues/15

2sn commented 3 years ago

I find it disappoint, however, that the documentation tutorial examples do not work https://pyyaml.org/wiki/PyYAMLDocumentation

import yaml
>>> yaml.load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)
aigarius commented 2 years ago

Yeah, that is not a good solution at all. Breaking 100% of old code is worse than breaking 0.5% of old code that is depending on some of the insecure functionality of the old default loader. Just default to a secure loader as the new default. It is a breaking change, but nowhere near as breaking as this.

The worst is when I have several different packages in my downstream dependencies using PyYAML and some use the new functionality and declare the PyYAML 6 as dependency while others do not provide a loader object to load and just crash now.

Change the PyYAML tutorial for the new syntax at least and let that be up for three years, maybe then it would be ok as a change. But a less destruictive change is always better than more destructive one.

kaivio commented 2 years ago

TBC: this is as-designed. Loading without specifying a loader has been deprecated and issuing loud warnings for the past three years (since 5.1), due to numerous CVEs being filed against the default loader's ability to execute arbitrary(ish) code. Since changing the default to a significantly less-capable loader was just as big a breaking change (and one that could cause more problems in the future), it was decided to just require folks to be specific about the capability they required from the loader going forward.

I don't think it's a good design. It's not in line with the usage habits.

Cerebus commented 2 years ago

Is there a reason why the documentation /still isn't updated a year later/?