jpsim / Yams

A Sweet and Swifty YAML parser.
https://jpsim.com/Yams
MIT License
1.11k stars 144 forks source link

Anchors with circular references #274

Open tjprescott opened 4 years ago

tjprescott commented 4 years ago

We have a generated YAML file that looks like this snippet:

schemas:
  objects:
    - &ref_0
      type: object
      apiVersions:
        - version: 0.0.0
      parents:
        all:
          - &ref_2
            type: object
            apiVersions:
              - version: 0.0.0
            children:
              all:
                - *ref_0
              immediate:
                - *ref_0

YAMS is unable to parse this, though we are able to parse this successfully in other languages we support (C#, Java, Python).

cc/ @sacheu

tjprescott commented 4 years ago

The error thrown is:

(ERROR) dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid YAML.", underlyingError: Optional(154:19: error: composer: found undefined alias:
                - *ref_0
                  ^)))
tjprescott commented 4 years ago

The offending code begins at line 278 in Parser.swift:

    func loadAlias(from event: Event) throws -> Node {
        guard let alias = event.aliasAnchor else {
            fatalError("unreachable")
        }
        guard let node = anchors[alias] else {
            throw YamlError.composer(context: nil,
                                     problem: "found undefined alias", event.startMark,
                                     yaml: yaml)
        }
        return node
    }

If there were a way for the call to loadAlias in loadNode to catch the undefined alias error and save it as an unresolved reference, then once the document has been parsed, it can go back through the unresolved references and see if it can resolve them. At that point, if there are any remaining unresolved references, it should throw an error.

cc/ @sacheu

tjprescott commented 4 years ago

Alternatively, if the parser could simply do an initial pass and NOT throw an error on unresolved aliases, then the alias would be constructed. If the parser could then reparse the document using the anchors from the previous pass, there shouldn't be any unresolved aliases.

jpsim commented 4 years ago

It seems like other parsers also fail on the sample input you provided.

yamllint.com experiences an internal server error when you give it that input. It also crashes when giving this simpler circular reference:

- &ref_0
  parents:
    - *ref_0
jpsim commented 4 years ago

This online tool also fails: codebeautify.org/yaml-validator

With this input:

- &ref_0
  parents:
    - *ref_0

It produces this error:

Error : Reference "ref_0" does not exist. Line : - *ref_0 undefined

jpsim commented 4 years ago

I'm looking at how this javascript yaml parser added support for circular references: https://github.com/eemeli/yaml/pull/84

I'm also thinking that we'd need to change quite a few of the internals of Yams would need to change to allow for references rather than our current value-type based approach.

Finally, I'm trying to think through the implications of our Codable support and how that would interoperate with a reference-based approach. This thread has some useful discussion around that: https://forums.swift.org/t/codable-with-references/13885/4

tjprescott commented 4 years ago

I have a draft PR in my local fork that is mostly working for me. https://github.com/tjprescott/Yams/pull/1

jpsim commented 4 years ago

Interesting take. Thanks for taking charge on it.

Are you planning on supporting decoding Codable types with references or would that be unsupported?

It might be possible to check to see if the type being decoded is a reference type and only decode circular references in that case.

tjprescott commented 4 years ago

We are using this in our code generator and making heavy use of Codable. Right now it seems to be working for our purposes (we are unblocked) but I'm sure there are edge cases we aren't detecting.

jpsim commented 4 years ago

I’m glad to hear that! I’ll happily review a PR once you’re happy with your solution.

jpsim commented 3 years ago

Just out of curiosity where did you land with this @tjprescott ?

tjprescott commented 3 years ago

We have a hacky workaround in my fork that we are using. It by no means is a general solution. I think that would require, as you noted, a fundamental change from using structs to classes, but it did unblock our scenario. It's the same PR linked above and we just take a source dependency on that branch.

jpsim commented 3 years ago

Yeah that makes sense. Let me know if you can think of a good way that Yams could offer hooks into its parsing to allow users to do this kind of thing without having to fork the project.