Open nikolaik opened 1 year ago
It seems to run with the following changes https://github.com/python-jsonschema/check-jsonschema/pull/223
Thanks for raising this, and sharing the initial crack at an implementation in #223! I need to think about this more to figure out what the right thing to do is.
One thing which we'd need to consider is that a document can contain a top-level array, like
- foo
- bar
and you can have a schema which expects/requires this. So that use-case needs to be supported and handled separately from loading multiple documents.
I'm also going to have to think about the output, and how it could be clarified in such a case.
We could replace the document name with the name + index, like yolo.yaml:0:$: ErrorString (here's where the formatting with :: happens today).
Or maybe it should be the line number where the document started, yolo.yaml:8:$: ErrorString ... 🤔
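To make the two candidate labels concrete, here is a minimal Python sketch. It is not check-jsonschema code: `document_starts` and `format_error` are hypothetical helpers, and the splitter is deliberately naive (it treats any line that is exactly `---` as a document marker, skips empty documents, and would be fooled by `---` inside a block scalar). A real implementation would take positions from the YAML parser.

```python
from typing import List


def document_starts(text: str) -> List[int]:
    """Return the 1-based line number of the first content line of each document.

    Naive assumption: a line consisting only of '---' separates documents.
    """
    starts: List[int] = []
    awaiting_content = True  # the next content line begins a new document
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if line.rstrip() == "---":
            awaiting_content = True
            continue
        # Blank lines and comments do not start a document.
        if awaiting_content and stripped and not stripped.startswith("#"):
            starts.append(lineno)
            awaiting_content = False
    return starts


def format_error(filename: str, position: int, path: str, message: str) -> str:
    """Build '<filename>:<position>:<path>: <message>'.

    'position' is either the document index or its starting line,
    depending on which of the two proposals is chosen.
    """
    return f"{filename}:{position}:{path}: {message}"


MULTI = "# multiobject.yaml\n---\nfoo: bar\n---\nbar: baz\n"

for index, lineno in enumerate(document_starts(MULTI)):
    print(format_error("multiobject.yaml", index, "$", "ErrorString"))   # index style
    print(format_error("multiobject.yaml", lineno, "$", "ErrorString"))  # line style
```

Both styles need the same bookkeeping; the only difference is which number ends up in the label.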
I'm receptive to this, especially with someone putting in the effort to show how it could be done (again, thanks!) but need to think harder about it to decide what is right.
Any update on this? 😅
Reporting-wise, IMHO <file_name>:<line_number> is better, since it can be used in editors to jump directly to the line that caused the error.
I haven't been able to make time for it, unfortunately. My time for check-jsonschema has been spent on other issues, but it's on my radar as a "would be nice". At check-jsonschema's current level of popularity, I'd call this relatively high interest (multiple :+1:s + a follow-up comment), which is relevant for my prioritization.
It's good to know that <filename>:<lineno>: <message> is a useful format for a specific purpose. I think we can call that part settled.
The part that's not clear to me, circling back to this thread, is what the CLI interface should be for this to disambiguate the following two files:
# list.yaml
- foo: bar
- bar: baz
vs
# multiobject.yaml
---
foo: bar
---
bar: baz
And how should it behave if you run this?
check-jsonschema --schemafile foo.json list.yaml multiobject.yaml
I think the right thing is that all YAML files be treated as multi-item. Is that all we need? I'm trying to make sure we can describe the behavior well, and it should be easy enough to write code to that spec.
> I think the right thing is that all YAML files be treated as multi-item. Is that all we need?
Actually, @sirosen I think about this a bit differently.
To me, list.yaml and multiobject.yaml aren't the same.
The former is 1 "object" (?) of "root type" List, with an "element type" of "something that can be foo or bar" (which seems unusual).
The latter, however, is 2 (!) "objects" of the same "root type" as the "element type" of the former. Does this make sense?
So the way I understand things, a YAML file with multiple documents is just a list of N "roots" instead of 1, which one considers to (have to) be "of the same type" for such a validation.
Once this is implemented, I could (try to) validate e.g. this with it.
> To me, list.yaml and multiobject.yaml aren't the same.
> The former is 1 "object" (?) of "root type" List, with an "element type" of "something that can be foo or bar" (which seems unusual).
> The latter, however, is 2 (!) "objects" of the same "root type" as the "element type" of the former. Does this make sense?
Yes, that matches my understanding as well. One file contains a single document, a list, while the other is a file containing two documents, each of which is an object.
I haven't looked at implementing this in a while, but IIRC the ruamel.yaml interface for loading multiple documents makes these hard to distinguish. As long as we don't "flatten" those two files to be identical once loaded, we can implement some change -- probably without issue.
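As a rough illustration of why the two files must not be "flattened", here is a small Python sketch. The loaded values are written out by hand as an assumption about what a multi-document loader would return; nothing here calls ruamel.yaml, and `check_documents` and `is_object` are hypothetical stand-ins for real schema validation.

```python
from typing import Any, Callable, List, Tuple

# Assumed loader output, written by hand for illustration.
# list.yaml: ONE document whose root is a list.
LIST_YAML_DOCS: List[Any] = [[{"foo": "bar"}, {"bar": "baz"}]]
# multiobject.yaml: TWO documents, each with an object at the root.
MULTIOBJECT_YAML_DOCS: List[Any] = [{"foo": "bar"}, {"bar": "baz"}]


def is_object(document: Any) -> bool:
    """Stand-in for a schema that requires an object at the root."""
    return isinstance(document, dict)


def check_documents(
    documents: List[Any], check: Callable[[Any], bool]
) -> List[Tuple[int, bool]]:
    """Apply the check to each document separately, keeping its index."""
    return [(index, check(document)) for index, document in enumerate(documents)]


print(check_documents(LIST_YAML_DOCS, is_object))
print(check_documents(MULTIOBJECT_YAML_DOCS, is_object))
```

Under these assumptions list.yaml yields a single failing document (its root is a list), while multiobject.yaml yields two passing ones. If both files were loaded to the same flat list, that difference would be lost.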
A kludge:
yq ea -j '[.]' document-stream.yaml \
| check-jsonschema --schemafile schema.json -
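One consequence of this kludge, as far as I can tell: the instance that reaches check-jsonschema is a single JSON array, so schema.json has to describe an array of documents rather than one document. A minimal sketch of deriving such a schema from a per-document one follows; `wrap_schema_for_stream` is a hypothetical helper, not part of check-jsonschema, and it assumes the per-document schema is a plain JSON object.

```python
import json
from typing import Any, Dict


def wrap_schema_for_stream(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a per-document schema so it validates an array of documents.

    The dialect declaration ("$schema"), if present, is hoisted to the top
    level of the wrapper. The input schema is not modified.
    """
    inner = dict(schema)  # shallow copy, so popping does not touch the input
    wrapper: Dict[str, Any] = {"type": "array", "items": inner}
    if "$schema" in inner:
        wrapper["$schema"] = inner.pop("$schema")
    return wrapper


# Hypothetical per-document schema, for illustration only.
per_document = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
}

print(json.dumps(wrap_schema_for_stream(per_document), indent=2))
```

Errors are then reported against array positions (the first document is item 0), not against line numbers in the original file.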
Is there any interest in adding support for checking yaml files with multiple documents in them? That is:
catalog-info.yaml:
Ref https://yaml.org/spec/1.2.2/#22-structures