alecthomas / voluptuous

CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.
https://pypi.org/project/voluptuous
BSD 3-Clause "New" or "Revised" License
1.81k stars 219 forks source link

MultipleInvalid errors order is not always the same #484

Open arnaud-soulard opened 1 year ago

arnaud-soulard commented 1 year ago

When validating this schema:

Schema1 = Schema({
    Required('value1'): Match2(r'^[\d\w_-]{1,64}$'),
    Required('value2'): Any(IntField(), DecimalField(precision=3), TemplateString(), str)
})

With this test:

def test_schema1_empty():
    with pytest.raises(MultipleInvalid) as exception_info:
        Schema1({})
    assert len(exception_info.value.errors) == 2
    assert str(exception_info.value.errors[0]) == "required key not provided @ data['value1']"
    assert str(exception_info.value.errors[1]) == "required key not provided @ data['value2']"

The test fails 1 over 3-4 times. I found it is related to the use of a set, that is unordered in python in schema_builder.py

Current code:

    def _compile_mapping(self, schema, invalid_msg=None):
        """Create validator for given mapping."""
        invalid_msg = invalid_msg or 'mapping value'

        # Keys that may be required
        all_required_keys = set(key for key in schema
                                if key is not Extra
                                and ((self.required
                                      and not isinstance(key, (Optional, Remove)))
                                     or isinstance(key, Required)))

        # Keys that may have defaults
        all_default_keys = set(key for key in schema
                               if isinstance(key, Required)
                               or isinstance(key, Optional))

My proposal (that works for me) would be to use a dict instead of a set (to preserve validations errors order):

    def _compile_mapping(self, schema, invalid_msg=None):
        """Create validator for given mapping."""
        invalid_msg = invalid_msg or 'mapping value'

        # Keys that may be required

        all_required_keys = {}
        for key in schema:
            if key is not Extra \
                and ((self.required \
                and not isinstance(key, (Optional, Remove))) \
                or isinstance(key, Required)):
                    all_required_keys[key] = key

        # Keys that may have defaults
        all_default_keys = {}
        for key in schema:
            if isinstance(key, Required) or isinstance(key, Optional):
                all_default_keys[key] = key
spacegaier commented 11 months ago

Just to confirm I correctly got your point. When you say "The test fails 1 over 3-4 times." you mean it fails every 3 to 4 times you run it, presumably because exception_info.value.errors[0] sometimes contains "required key not provided @ data['value1']" and sometimes ""required key not provided @ data['value2']"?

arnaud-soulard commented 11 months ago

exactly, it is related to the order