Closed robert-sjoblom closed 3 months ago
Hmm. While the specification allows for comments, it is clearly stated that they must not convey information. When testing with the yaml-test-suite
, comments cannot emit parser events, meaning that they are unavailable to the YamlLoader
. We could add an event variant (yay, another API breaking change) and ignore it in the yaml-test-suite
test.
If you want to tackle this:
scanner.rs
. The scanner currently ignores all comments. You need a new scanner Event
variant for comments. Search for occurences of #
in the file and you should find all places where comments are encountered.Event
.Loader
to take that into account. You will probably have to write another YamlWithComment
enum since you can't for instance have
- a
# This is b
- b
be an array with
Yaml::String("a")
Yaml::Comment("This is b")
Yaml::String("b")
as this would mean the comment conveys "meaning" (it shifts indices in the array).
Then:
yaml-test-suite
test to accomodate for this change (and any other failing test)To be fair (and as a warning), with my knowledge of the codebase, I'd expect this to take me at least a couple days to implement.
Do you have a reference for another parser (even in another language) that implements this feature?
To be fair (and as a warning), with my knowledge of the codebase, I'd expect this to take me at least a couple days to implement.
I'll gladly work on this, it'll take some time but it's not as if anyone else has asked for it.
Do you have a reference for another parser (even in another language) that implements this feature?
I do -- it's what we've been forced to use due to every rust yaml crate being slightly off in what we want (yaml-rust doesn't honor whitespace (blank newlines) and strips comments, another crate honors whitespace but strips comments, a third honors comments but strips indentation...): https://pypi.org/project/ruamel.yaml/
They have indeed added a Comment
scanner event. In Python, they can dynamically add fields (which they appear to use for comments). Doing so in Rust would incur a cost that Rustaceans will probably hardly accept.
I don't really see an easy way around that. On the one hand, I don't like duplicating the Yaml
structure, but on the other hand I don't see how CommentedYaml
could easily re-use Yaml
. I think I'll leave up to you to find out ^^
I'll gladly work on this, it'll take some time but it's not as if anyone else has asked for it.
Thank you very much for your dedication! :heart: Also, maybe others don't ask but would use the feature.
Following https://github.com/Ethiraric/yaml-rust2/issues/17#issuecomment-2032585651, you might want to turn your contributions against https://github.com/saphyr-rs/saphyr-parser and https://github.com/saphyr-rs/saphyr. They contain the same code as here except the name has changed (and saphyr has the parser as dependency). I'll see into transferring your issue once Github allows me.
I'll gladly work on this, it'll take some time but it's not as if anyone else has asked for it.
The lack of this feature seems to prevent yaml-rust2 from correctly parsing sublime-syntax files, e.g.: https://github.com/sublimehq/Packages/blob/fa6b8629c95041bf262d4c1dab95c456a0530122/Makefile/Makefile.sublime-syntax
This in turn prevents adoption in the syntect crate.
What exactly is the error you encounter? The file should parse fine if it is valid yaml. Do you depend on the comments in your code?
Trying to parse that file with YamlLoader::load_from_str
errors with:
called `Result::unwrap()` on an `Err` value: ScanError { mark: Marker { index: 10, line: 2, col: 0 }, info: "comments must be separated from other tokens by whitespace" }
EDIT: I might have wrongly assumed that this is related to this feature, but still probably shouldn't happen? Syntect seems to be parsing this file correctly with the old yaml-rust crate
That appears to be an issue with the parser indeed.
It is however unrelated to that specific Github issue which aims at adding comments to the Yaml
object.
From what I saw in the yaml parser playground, the file appears to be valid. I'll try and have a look during the weekend, otherwise later week.
This is fixed in 0eb7936e1e85553031dbcb99fa8837be3484c2ee.
It was specifically an issue with comments on column 0 on the line immediately following a tag directive ^^"
@robert-sjoblom have you started working on the feature?
I have embarked on a heavy refactoring in the scanner in order to support borrowing in saphyr-parser.
I have not; I've been somewhat swamped both at work and in my "spare" time (as the lack of a reply indicates). I hope that I'll get some time next week, but ultimately I suspect this'll be a slow-moving feature, what with family/work taking up most of my time.
The 1.2 spec of yaml allows comments: https://yaml.org/spec/1.2.2/#66-comments
One current issue with yaml-rust2 (and, to be fair, the original yaml-rust crate) is that it strips all comments. Thus, the following (shortened) example doesn't work:
foo.yaml
I would expect the output to contain my new entry and the comment
# comment here
. That's not what happens, though, since comments are lost.I'd gladly help out with this, although I'm unsure of where to start.