jpmckinney / validictory

🎓 deprecated general purpose python data validator

min- / maxItems for dicts #101

Closed Heiko-san closed 8 years ago

Heiko-san commented 8 years ago

I could have used this feature several times lately. If, at some layer, you can't say how the keys will be named and have to work with additionalProperties and/or patternProperties, your schema will also pass when not a single property is present in the data. The same happens if all your properties have required=False because they are interchangeable.

However, this is not always the intended behavior; in most cases you would like to express "there must be at least 1".

The only intuitive solution I found in the docs is to add minItems and/or maxItems, which right now seem to work only for arrays and are ignored in a type object layer.

jamesturk commented 8 years ago

Hi, I'd be open to making this work if you could find some time to come up with some test cases for the expected behavior.

bmdemouser commented 8 years ago

A real world example can be found in the code from #100:

...
                    'modules': {
                        'additionalProperties': False,
                        'type': 'object',
                        'properties': {
                            'statuscode': {
                                'required': False,
                                ...
                            },
                            'contentcheck': {
                                'required': False,
                                ...
                            },
                            'myothercheck': {
                                'required': False,
                                ...
                            },
...

This schema is actually generated by code that fills in the available test modules and their corresponding configuration validation schemas. The whole thing should validate a config file: the user should be free to choose test modules from the set of available modules, but should have to configure at least one.
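A minimal sketch of how such a schema could be assembled, assuming a hypothetical registry `AVAILABLE_MODULES` that maps module names to their configuration schemas. The function name and the per-module schemas are illustrative placeholders, not taken from the code in #100:

```python
import copy

# Hypothetical registry: module name -> schema for that module's configuration.
AVAILABLE_MODULES = {
    'statuscode': {'type': 'object'},
    'contentcheck': {'type': 'object'},
    'myothercheck': {'type': 'object'},
}


def build_modules_schema(available_modules):
    """Build the 'modules' schema with every known module as an optional property."""
    properties = {}
    for name, module_schema in available_modules.items():
        entry = copy.deepcopy(module_schema)  # do not mutate the registry
        entry['required'] = False  # any single module may be left out
        properties[name] = entry
    return {
        'type': 'object',
        'additionalProperties': False,  # reject modules that are not registered
        'properties': properties,
    }


modules_schema = build_modules_schema(AVAILABLE_MODULES)
```

Because every entry is optional, nothing in a schema built this way rejects an empty `modules` section, which is the gap this issue is about.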

I guess test cases will come down to something like:

#!/usr/bin/env python
import validictory

schema1 = {
    'type': 'object',
    'additionalProperties': False,
    'patternProperties': {
        '^(one|two|three)$': {
            'type': 'integer'
        }
    },
    'minItems': 1,
    'maxItems': 2,
}

schema2 = {
    'type': 'object',
    'additionalProperties': False,
    'properties': {
        'one': {
            'required': False,
            'type': 'integer'
        },
        'two': {
            'required': False,
            'type': 'integer'
        },
        'three': {
            'required': False,
            'type': 'integer'
        }
    },
    'minItems': 1,
    'maxItems': 2,
}

testdata = {
    'should_pass': {
        'one': 1
    },
    'should_also_pass': {
        'two': 2
    },
    'should_also_pass_multi_value': {
        'one': 1,
        'three': 3
    },
    'should_fail_too_many': {
        'one': 1,
        'two': 2,
        'three': 3
    },
    'should_fail_too_few': {
    },
    'should_fail_unknown_item': {
        'four': 4
    },
}

for schema in (schema1, schema2):
    print("--------------------------")
    for name in sorted(testdata):
        try:
            validictory.validate(testdata[name], schema)
        except validictory.ValidationError as e:
            print('{0} failed with: {1}'.format(name, e))
            continue
        print('{0} passed'.format(name))

This runs on Python 2 and 3 and right now produces this output:

--------------------------
should_also_pass passed
should_also_pass_multi_value passed
should_fail_too_few passed
should_fail_too_many passed
should_fail_unknown_item failed with: Value {'four': 4} for field '<obj>' contains additional property 'four' not defined by 'properties' or 'patternProperties' and additionalProperties  is False
should_pass passed
--------------------------
should_also_pass passed
should_also_pass_multi_value passed
should_fail_too_few passed
should_fail_too_many passed
should_fail_unknown_item failed with: Value {'four': 4} for field '<obj>' contains additional property 'four' not defined by 'properties' or 'patternProperties' and additionalProperties  is False
should_pass passed

Heiko-san commented 8 years ago

Sorry for posting with the wrong account, it was me ... ^^'

I guess that when leaving out additionalProperties: False, the behavior should be very much the same, except that "should_fail_unknown_item" should then pass, since it is within the range of 1 to 2 items / key-value pairs.

Actually, I think all that has to be done is to check minItems / maxItems against len(dict) if they are present. Optionally, new keywords could be designed for this in case it could harm existing constructs like:

...
"type": ["array", "object"],
"minItems": 1
...

"minProperties" and "maxProperties" would describe the behavior accurately and be within the existing naming scheme, I'd suggest.
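The check described here can be sketched in a few lines. This is a stand-alone illustration under the suggested keyword names, not validictory's actual code; `check_property_count` and its error messages are hypothetical:

```python
def check_property_count(value, schema):
    """Return a list of error messages; an empty list means the check passed.

    Only dicts are counted, so a schema with "type": ["array", "object"]
    and "minItems" keeps its array-only meaning.
    """
    errors = []
    if not isinstance(value, dict):
        return errors  # the keywords do not apply to non-objects
    count = len(value)
    minimum = schema.get('minProperties')
    maximum = schema.get('maxProperties')
    if minimum is not None and count < minimum:
        errors.append('expected at least {0} properties, got {1}'.format(minimum, count))
    if maximum is not None and count > maximum:
        errors.append('expected at most {0} properties, got {1}'.format(maximum, count))
    return errors


schema = {'type': 'object', 'minProperties': 1, 'maxProperties': 2}
print(check_property_count({}, schema))
print(check_property_count({'one': 1}, schema))
print(check_property_count({'one': 1, 'two': 2, 'three': 3}, schema))
```

Using separate keywords rather than reusing minItems / maxItems means existing schemas that allow both arrays and objects are not affected.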

Heiko-san commented 8 years ago

Turns out (after a deeper look into your code) this is done really easily. I was just experimenting:

hfi@hfi-UX305LA ~/git/validictory $ ./test.py 
--------------------------
should_also_pass passed
should_also_pass_multi_value passed
should_fail_too_few failed with: Value {} for field '<obj>' must have number of properties greater than or equal to 1
should_fail_too_many failed with: Value {'three': 3, 'two': 2, 'one': 1} for field '<obj>' must have number of properties less than or equal to 2
should_fail_unknown_item failed with: Value {'four': 4} for field '<obj>' contains additional property 'four' not defined by 'properties' or 'patternProperties' and additionalProperties  is False
should_pass passed
--------------------------
should_also_pass passed
should_also_pass_multi_value passed
should_fail_too_few failed with: Value {} for field '<obj>' must have number of properties greater than or equal to 1
should_fail_too_many failed with: Value {'three': 3, 'two': 2, 'one': 1} for field '<obj>' must have number of properties less than or equal to 2
should_fail_unknown_item failed with: Value {'four': 4} for field '<obj>' contains additional property 'four' not defined by 'properties' or 'patternProperties' and additionalProperties  is False
should_pass passed

I will try to write some tests for it and issue another pull request :)

PS I'm using "minProperties" and "maxProperties" now
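For reference, a sketch of how schema1 from the test script presumably looks with the renamed keywords. Only the last two keys change, and schema2 would change the same way. This is inferred from the output above rather than copied from the patch:

```python
schema1 = {
    'type': 'object',
    'additionalProperties': False,
    'patternProperties': {
        '^(one|two|three)$': {
            'type': 'integer'
        }
    },
    'minProperties': 1,  # was 'minItems'
    'maxProperties': 2,  # was 'maxItems'
}
```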

jamesturk commented 8 years ago

Wanted to confirm that min/max properties fixes this one. OK to close?