23andMe / Yamale

A schema and validator for YAML.
MIT License
680 stars 89 forks source link

Handle position with a custom class instead of string #60

Closed drmull closed 5 years ago

drmull commented 5 years ago

Add datapath class to replace the 'dotted string' used to indicate the path in the data/schema

mildebrandt commented 5 years ago

Wow, thanks! This initially looks good. I'll need a little more time to go through the whole PR, I'm pretty swamped this week. Can you please fix the PEP8 formatting in the meantime?

drmull commented 5 years ago

Thanks for the comments! The code actually became a bit simpler when I fixed the mutable default arguments issue..

mildebrandt commented 5 years ago

I only looked a little bit at this and haven't taken the time to fully think about it yet. Do you have an understanding how issue #27 would be fixed with this new structure?

drmull commented 5 years ago

Yes, I have been working on a follow-up PR that would fix #27 and #37. The idea I have been working on there is to, instead of flattening the path, keep the lists and maps encountered when processing the schema. Then when traversing the schema the list/map keys can be compared with the ones in the data to see that there are no additional keys present.

mildebrandt commented 5 years ago

That's great, I look forward to seeing what you come up with!

As you make changes, please run a performance comparison between the original code and the new changes. We run this on tens of thousands of yaml files hundreds of times per day...so performance is important for us.

And thanks for your contributions!

sjn-network-automation commented 5 years ago

@drmull thanks. really looking forward to this fix.

drmull commented 5 years ago

@mildebrandt Do you want me to update this PR with the strict mode code or push it as a separate PR? I still got some cleanup and documentation left so not 100% yet.. I do not have access to any large enough data files to do a proper performance comparison. Do you have the possibility to help with that?

mildebrandt commented 5 years ago

Since it's building on top of this work, it might be easier to review if it's all together. So you can use this PR.

Sorry, I don't have anything I can share. I can create something for you next week. If you get to it before I do, I think a schema with a few includes and nested lists/maps of 3-4 levels would be sufficient. The total yaml file size should be around 50kB.

drmull commented 5 years ago

Finally had some time to do the update..

The first commit should also fix #62

mildebrandt commented 5 years ago

Thanks so much! I'll try to carve out some time to look at this.

mildebrandt commented 5 years ago

There's an exception when I try to validate this yaml:

field1:
  field1_2:
    field1_2_1: value
field2: value

with this schema:

field1:
  field1_1:
    field1_1_1: str(required=False)
  field1_2:
    field1_2_1: str(required=False)
field2: str()

The error is:

Error!
Schema: m1.schema
Data file: /Users/chrism/workspace/drmull/Yamale/m1.yaml
Traceback (most recent call last):
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 75, in _validate_item
    data_item = data[key]
KeyError: 'field1_1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/chrism/workspace/drmull/Yamale/yamale/command_line.py", line 29, in _validate
    yamale.validate(schema, data)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/yamale.py", line 38, in validate
    schema.validate(d, path, strict)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 56, in validate
    errors = self._validate(self._schema, data, path, strict)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 98, in _validate
    strict)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 144, in _validate_static_map_list
    key)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 84, in _validate_item
    return self._validate(validator, data_item, path, strict)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 98, in _validate
    strict)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 144, in _validate_static_map_list
    key)
  File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 78, in _validate_item
    if validator.is_optional:  # Optional? Who cares.
AttributeError: 'dict' object has no attribute 'is_optional'
mildebrandt commented 5 years ago

Can you please also add a command line argument for strict mode? Thanks!

drmull commented 5 years ago

Good comments. Fixed the exception problem and added the strict mode flag, also added test cases for both issues.

mildebrandt commented 5 years ago

Thanks for all the hard work @drmull! There was one unexpected change which will cause an incompatibility with previous versions, but we think that may be the right thing in the long run.

In the schema and yaml sample I provided in my comments, all the children of field1_1 were marked as optional which made field1_1 also optional in the current yamale versions. This is probably a side-affect that we didn't want. Your PR changed this behavior, and now it'll fail unless the parent is also marked as optional.

When we merge your PR, we'll be bumping the version to 2.0 since it breaks existing validations. I don't have a date when we'll merge and create the new release, but we are excited for strict mode....so it won't be too long. :)

drmull commented 5 years ago

In the schema and yaml sample I provided in my comments, all the children of field1_1 were marked as optional which made field1_1 also optional in the current yamale versions. This is probably a side-affect that we didn't want. Your PR changed this behavior, and now it'll fail unless the parent is also marked as optional.

Hm, I see. I missed that it worked that way. Spontaneously I think it will be a bit clunky to add that behavior in the code as it looks now so if you are OK with the change as it is then I am happy :+1: