yaml / pyyaml

Canonical source repository for PyYAML
MIT License
2.57k stars 517 forks source link

ConstructorError regression in version 5.2 #355

Open tgsmith61591 opened 4 years ago

tgsmith61591 commented 4 years ago

The release of 5.2 last night created a regression on deserializing OrderedDicts. Here's a copy/pastable example that works in 5.1.2:

import yaml
from collections import OrderedDict

def yaml_open(path):
    with open(path, 'r') as f:
        data = yaml.load(f, Loader=yaml.FullLoader)
    return data

def yaml_save(payload, path):
    with open(path, 'w') as f:
        yaml.dump(payload, f, default_flow_style=True)

x = {'int': 123, 'dict': OrderedDict(a=1, b=2)}
yaml_save(x, 'payload.yaml')
yaml_open('payload.yaml')
# {'dict': OrderedDict([('a', 1), ('b', 2)]), 'int': 123}

When I try it on 5.2, here's what I get:

---------------------------------------------------------------------------
ConstructorError                          Traceback (most recent call last)
<ipython-input-1-d6eb6f0331bf> in <module>
     16 x = {'int': 123, 'dict': OrderedDict(a=1, b=2)}
     17 yaml_save(x, 'payload.yaml')
---> 18 yaml_open('payload.yaml')
     19 # {'dict': OrderedDict([('a', 1), ('b', 2)]), 'int': 123}

<ipython-input-1-d6eb6f0331bf> in yaml_open(path)
      5 def yaml_open(path):
      6     with open(path, 'r') as f:
----> 7         data = yaml.load(f, Loader=yaml.FullLoader)
      8     return data
      9

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/__init__.py in load(stream, Loader)
    112     loader = Loader(stream)
    113     try:
--> 114         return loader.get_single_data()
    115     finally:
    116         loader.dispose()

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in get_single_data(self)
     41         node = self.get_single_node()
     42         if node is not None:
---> 43             return self.construct_document(node)
     44         return None
     45

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_document(self, node)
     50             self.state_generators = []
     51             for generator in state_generators:
---> 52                 for dummy in generator:
     53                     pass
     54         self.constructed_objects = {}

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_yaml_map(self, node)
    402         data = {}
    403         yield data
--> 404         value = self.construct_mapping(node)
    405         data.update(value)
    406

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_mapping(self, node, deep)
    208         if isinstance(node, MappingNode):
    209             self.flatten_mapping(node)
--> 210         return super().construct_mapping(node, deep=deep)
    211
    212     def construct_yaml_null(self, node):

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_mapping(self, node, deep)
    133                 raise ConstructorError("while constructing a mapping", node.start_mark,
    134                         "found unhashable key", key_node.start_mark)
--> 135             value = self.construct_object(value_node, deep=deep)
    136             mapping[key] = value
    137         return mapping

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_object(self, node, deep)
     90                     constructor = self.__class__.construct_mapping
     91         if tag_suffix is None:
---> 92             data = constructor(self, node)
     93         else:
     94             data = constructor(self, tag_suffix, node)

/anaconda3/envs/carbon-env/lib/python3.6/site-packages/yaml/constructor.py in construct_undefined(self, node)
    418         raise ConstructorError(None, None,
    419                 "could not determine a constructor for the tag %r" % node.tag,
--> 420                 node.start_mark)
    421
    422 SafeConstructor.add_constructor(

ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object/apply:collections.OrderedDict'
  in "payload.yaml", line 1, column 8

Interestingly, the data still serializes properly, so it appears to be an issue in the deserialization only.

perlpunk commented 4 years ago

Yeah, you have to use yaml.UnsafeLoader instead of yaml.FullLoader now, because deserializing !!python/object/apply objects is exploitable and can run arbitrary code.

tgsmith61591 commented 4 years ago

Thanks @perlpunk Is it worth putting that in the error message?

perlpunk commented 4 years ago

Well, the error message is currently generic. So I would have to look at which Loader was used and if it is FullLoader, and how the tag looks like, and if it matches python/object/apply then include a hint in the error message.

The warning that you get when not using a loader explicitly should explain things.

GBR-613 commented 4 years ago

I am working with v5,2 shipped by pip by default, and python v3.8.1 (currently the last one) Each of the following fails with an exception:

    my_yaml = yaml.load(my_file) 
    my_yaml = yaml.load(my_file, Loader=yaml.Loader) 
    my_yaml = yaml.load(my_file, Loader=yaml.FullLoader) 

The following still works:

    my_yaml = yaml.load(my_file, Loader=yaml.BaseLoader) 

It's highly frustrating, because many forums suggest to use FullLoader in case of troubles. At least, BaseLoader which is the working one should be the default.