Closed drmull closed 5 years ago
Wow, thanks! This initially looks good. I'll need a little more time to go through the whole PR, I'm pretty swamped this week. Can you please fix the PEP8 formatting in the meantime?
Thanks for the comments! The code actually became a bit simpler when I fixed the mutable default arguments issue..
I only looked a little bit at this and haven't taken the time to fully think about it yet. Do you have an understanding how issue #27 would be fixed with this new structure?
Yes, I have been working on a follow-up PR that would fix #27 and #37. The idea I have been working on there is to, instead of flattening the path, keep the lists and maps encountered when processing the schema. Then when traversing the schema the list/map keys can be compared with the ones in the data to see that there are no additional keys present.
That's great, I look forward to seeing what you come up with!
As you make changes, please run a performance comparison between the original code and the new changes. We run this on tens of thousands of yaml files hundreds of times per day...so performance is important for us.
And thanks for your contributions!
@drmull thanks. really looking forward to this fix.
@mildebrandt Do you want me to update this PR with the strict mode code or push it as a separate PR? I still got some cleanup and documentation left so not 100% yet.. I do not have access to any large enough data files to do a proper performance comparison. Do you have the possibility to help with that?
Since it's building on top of this work, it might be easier to review if it's all together. So you can use this PR.
Sorry, I don't have anything I can share. I can create something for you next week. If you get to it before I do, I think a schema with a few includes and nested lists/maps of 3-4 levels would be sufficient. The total yaml file size should be around 50kB.
Finally had some time to do the update..
The first commit should also fix #62
Thanks so much! I'll try to carve out some time to look at this.
There's an exception when I try to validate this yaml:
field1:
field1_2:
field1_2_1: value
field2: value
with this schema:
field1:
field1_1:
field1_1_1: str(required=False)
field1_2:
field1_2_1: str(required=False)
field2: str()
The error is:
Error!
Schema: m1.schema
Data file: /Users/chrism/workspace/drmull/Yamale/m1.yaml
Traceback (most recent call last):
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 75, in _validate_item
data_item = data[key]
KeyError: 'field1_1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/chrism/workspace/drmull/Yamale/yamale/command_line.py", line 29, in _validate
yamale.validate(schema, data)
File "/Users/chrism/workspace/drmull/Yamale/yamale/yamale.py", line 38, in validate
schema.validate(d, path, strict)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 56, in validate
errors = self._validate(self._schema, data, path, strict)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 98, in _validate
strict)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 144, in _validate_static_map_list
key)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 84, in _validate_item
return self._validate(validator, data_item, path, strict)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 98, in _validate
strict)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 144, in _validate_static_map_list
key)
File "/Users/chrism/workspace/drmull/Yamale/yamale/schema/schema.py", line 78, in _validate_item
if validator.is_optional: # Optional? Who cares.
AttributeError: 'dict' object has no attribute 'is_optional'
Can you please also add a command line argument for strict mode? Thanks!
Good comments. Fixed the exception problem and added the strict mode flag, also added test cases for both issues.
Thanks for all the hard work @drmull! There was one unexpected change which will cause an incompatibility with previous versions, but we think that may be the right thing in the long run.
In the schema and yaml sample I provided in my comments, all the children of field1_1
were marked as optional which made field1_1
also optional in the current yamale versions. This is probably a side-affect that we didn't want. Your PR changed this behavior, and now it'll fail unless the parent is also marked as optional.
When we merge your PR, we'll be bumping the version to 2.0 since it breaks existing validations. I don't have a date when we'll merge and create the new release, but we are excited for strict mode....so it won't be too long. :)
In the schema and yaml sample I provided in my comments, all the children of
field1_1
were marked as optional which madefield1_1
also optional in the current yamale versions. This is probably a side-affect that we didn't want. Your PR changed this behavior, and now it'll fail unless the parent is also marked as optional.
Hm, I see. I missed that it worked that way. Spontaneously I think it will be a bit clunky to add that behavior in the code as it looks now so if you are OK with the change as it is then I am happy :+1:
Add datapath class to replace the 'dotted string' used to indicate the path in the data/schema