Closed wgeraghty closed 4 months ago
We use https://github.com/yaml/yaml-test-suite to validate yaml-rust2. If this is in-spec then this could be made into a test file in that repo.
If this is out of spec.. (sorry, I don't know whether it is, but it very well might be) what are your thoughts @Ethiraric ~ is adding booleans to YamlLoader to opt-in to non-standard behavior something we should entertain on a case-by-case basis?
@wgeraghty are these files from a common tool? Are the files available in a public repo anywhere?
This construct is invalid. I am on mobile and can't check right now which part of the specification this violates, but the yaml playground errors out with this specific input.
This now brings interesting and hard questions about yaml compliance. Technically, the bug report should be filed to whoever distributed the yaml. But fixing the tag issue from #10 likely would break most yaml parsers, due to most yaml parsers not being compliant.
I would like this library to still be usable as a general yaml parser, but we need to carefully think about which non-compliant constructs we want to allow and disallow. Each construct will add branches and slow the code down a tiny bit, even for those who had fully compliant yamls.
Rust features could allow a good middle ground between the 2, but more and more feature toggles exponentially increase the complexity of testing (a test may succeed with a set of features and fail with another).
I'm definitely okay with fixing #10. I guess that it's a common "mistake". This one I don't know. Is the source for this open? Could it be fixed there? Do they have a specification for their format?
I have not been able to find the spec for the Unity yaml format; their parser is not open. There is a massive amount of content generated already that uses this format so they are unlikely to change it.
I wasn't sure if this change made sense to put behind a yaml_rust2::parser::Parser
flag, a feature toggle, or a fork.
How exactly to handle that depends on how fragmented the YAML flavours are across all tools. It requires knowledge that I currently do not have.
Your issue sparks this discussion, and I think I'll need to ask around a bit about the YAML ecosystem before coming to a conclusion.
In an ideal world I'd have a way to just get a copy of reference files from every tool and just feed them into the parser to see what is most common and what isn't, what is fixable and what isn't, and so on. I'll make my very best to do this ASAP, but it may take me a week to a month to get all that data.
If you depend on this feature immediately, I'd unfortunately suggest you patch the code for now. I think forking in the worst solution on the long term, but it might be a solution on the short term (unless your build process allows for integrating this as a submodule, using yaml-rust2 = { version = "0.7.0", path = "./dependencies/yaml-rust2" }
in your Cargo.toml
and applying a patch). How would that sound with you?
I can modify the build process so a submodule should be good. Thanks for looking into this.
You're welcome!
By the way, this appears to be a pretty Unity-specific thing to do. I tried other parsers and they all error out with your input. None of them like the space in the tag.
Here are a couple of related forks that I believe might've been created to address this use case:
Heads-up @xmxu @ilinchunjie ~ I'm pinging since I think this topic might be of interest to you.
This will not be implemented in yaml-rust2
. There could be a crate under saphyr-rs
to parse Unity files, but we'll have to finalize the crate before.
Results in
ScanError { mark: Marker { index: 53, line: 4, col: 3 }, info: "mapping values are not allowed in this context" }
.I'm not sure if this is in the spec but the yaml files I'm working with have it. My workaround for now is changing
while is_tag_char(self.look_ch())
towhile !is_breakz(self.look_ch())
. This doesn't feel quite right but it allows me to use a customMarkedEventReceiver
to do what I need.