23andMe / Yamale

A schema and validator for YAML.
MIT License

Validate unique key values in list of dictionaries #201

Closed nbaju1 closed 2 years ago

nbaju1 commented 2 years ago

I would like to validate that key values are unique across a list of dictionaries, while still applying the normal validation (strings, floats, etc.) to those key values.

Example of .yml that should fail this kind of validation:

pets:
  - name: my_favourite_pet
    type: cat
  - name: my_favourite_pet
    type: dog

I'm able to write a custom validator to do this, but then I can't apply other validations to 'name' and 'type' (such as string validation for 'name' and enum for 'type'):

from yamale.validators import Validator


class Pets(Validator):
    """ Custom pets validator """
    tag = "pets"

    def _is_valid(self, value):
        # Valid only if no two pets share the same name.
        pet_names = [pet['name'] for pet in value]
        return len(pet_names) == len(set(pet_names))

    def fail(self, value):
        return "Pet names are not unique."

Where the schema for this would be:

pets: pets()

Any tips on how to accomplish this, other than writing a validator that reimplements the default validators I want to include?

mechie commented 2 years ago

You could override validators.List to add a new Constraint, something like UniqueItemsByProperty that checks each item on the list for a property, maybe like:

def _is_valid(self, value):
    if not value or not self.unique_property:
        return None
    errors = []
    seen = set()
    for item in value:
        property = getattr(item, self.unique_property, None) if item else None
        if property in seen:
            errors.append(self._fail(property, value))
        seen.add(property)
    return errors

Then list(include('Pet'), unique_property='name'). The same could be done with a kwarg in the List directly, if you prefer a smaller footprint.
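
The core of that check can be tried outside Yamale. Below is a minimal sketch in plain Python, assuming each list item is a dict as in the pets example; `find_duplicates` is a hypothetical helper name, not part of Yamale.

```python
def find_duplicates(items, key):
    """Return the values of `key` that occur more than once, each listed once."""
    seen = set()
    duplicates = []
    for item in items or []:
        # Items without the key are skipped; checking presence is a separate concern.
        if not isinstance(item, dict) or key not in item:
            continue
        value = item[key]
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return duplicates


pets = [
    {"name": "my_favourite_pet", "type": "cat"},
    {"name": "my_favourite_pet", "type": "dog"},
]
print(find_duplicates(pets, "name"))  # ['my_favourite_pet']
print(find_duplicates(pets, "type"))  # []
```

Returning the duplicated values rather than a bare boolean makes it easy to name the offending value in the error message.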

nbaju1 commented 2 years ago

Thank you for the suggestion @mechie, that worked like a charm :)

My implementation is slightly different, as each item in 'value' is a dictionary. Any suggestions for improvement are welcome!

from yamale.validators.constraints import Constraint


class UniqueItemsByProperty(Constraint):
    keywords = {'unique_property': str}

    def _is_valid(self, value):
        if not value or not self.unique_property:
            return True
        seen = set()
        for item in value:
            # Skip items that lack the property, so no stale or unset value is compared.
            if self.unique_property not in item:
                continue
            prop = item[self.unique_property]
            if prop in seen:
                self.fail = f"Property '{self.unique_property}' is not unique. Duplicate value '{prop}'"
                return False
            seen.add(prop)
        return True

    def _fail(self, value):
        return self.fail

...

validators = DefaultValidators.copy()
validators[List.tag].constraints.append(UniqueItemsByProperty)
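
One thing to keep in mind with the snippet above: `dict.copy()` is shallow, so if `constraints` is a class-level list (which it appears to be for `List`), appending to it changes the validator class everywhere, not just in the copied dict. The sketch below illustrates this with hypothetical stand-in classes (`FakeList`, `UniqueList`) rather than Yamale's own, and shows a subclass as one way to keep the change local.

```python
class FakeList:
    """Stand-in for a validator class with a class-level constraints list."""
    tag = "list"
    constraints = ["LengthMin", "LengthMax"]


defaults = {FakeList.tag: FakeList}


# Isolated variant: a subclass with its own constraints list, registered in a copy.
class UniqueList(FakeList):
    constraints = FakeList.constraints + ["UniqueItemsByProperty"]


isolated = defaults.copy()
isolated[UniqueList.tag] = UniqueList
print(defaults["list"].constraints)  # ['LengthMin', 'LengthMax']

# In-place variant: the copied dict still points at the same class object,
# so the append is visible through the original dict as well.
shared = defaults.copy()
shared["list"].constraints.append("UniqueItemsByProperty")
print(defaults["list"].constraints)  # ['LengthMin', 'LengthMax', 'UniqueItemsByProperty']
```

Whether the global change matters depends on whether other schemas in the same process rely on the unmodified list validator.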

mildebrandt commented 2 years ago

Thanks for your help @mechie !