ikatyang / tree-sitter-yaml

YAML grammar for tree-sitter
https://ikatyang.github.io/tree-sitter-yaml/
MIT License
93 stars 38 forks source link

Schemas support #10

Closed char0n closed 3 years ago

char0n commented 3 years ago

HI @ikatyang,

By inspecting the CST produced by tree-sitter-yaml is seems that it only support Failsafe schema. Is there any plan to include other schemas, like JSON or Core ?

char0n commented 3 years ago

@ikatyang are you planning to support additional standard schemas? I'm currently in the crossroads of implementing schema support as an extension to my tree-sitter-yaml Concrete Syntax Tree -> Abstract Syntax Tree transformer and want to avoid duplicate effort.

ikatyang commented 3 years ago

Yes, I plan to work on it this weekend.

char0n commented 3 years ago

@ikatyang thanks for the work on schema support but I think we have couple of problems with current implementation:

1.) Tag resolution override

Having this YAML fragment I'd expect that the result will be string_scalar instead of integer_scalar.

prop: !<tag:yaml.org,2002:str> 0

The parser needs to abide to the explicit tags.

2.) Using of URI tags

prop1: !<tag:yaml.org,2002:str> 0
prop2: !!str 0

Both URI and Cannonical form must be supported.

3.) Using of TAG Directives

%TAG !yaml! tag:yaml.org,2002:
---
!yaml!str 0

I'd expect to have string_scalar instead of integer_scalar here.

ikatyang commented 3 years ago

The schema support is slightly out of scope of a CST parser, but since the primary use case of tree-sitter is to be used as a syntax highlighter, I think it should be worth implementing the basic schema support if it is not too hard, which is the current implementation.

But the tag resolution seems to be clearly out of scope since the former tag directives can affect the latter tag resolution, and tree-sitter is an incremental CST parser that never backtrack, the case 3 you mentioned above is simply impossible to be implemented in tree-sitter since there is no way to trigger an incremental re-parse for the latter tag resolution if the former tag directive changed.

char0n commented 3 years ago

The schema support is slightly out of scope of a CST parser, but since the primary use case of tree-sitter is to be used as a syntax highlighter, I think it should be worth implementing the basic schema support if it is not too hard, which is the current implementation.

So basically this grammar only asserts the type of value according the Core schema and ignores the tags if the differ from the matched value.

But the tag resolution seems to be clearly out of scope since the former tag directives can affect the latter tag resolution, and tree-sitter is an incremental CST parser that never backtrack, the case 3 you mentioned above is simply impossible to be implemented in tree-sitter since there is no way to trigger an incremental re-parse for the latter tag resolution if the former tag directive changed.

Yeah I suspected that this is going to be an issue. Nevertheless, it was worth the shot. Going to implement TAG resolution algorithm on our side + Schema matching according to resolved tag. It's going to be fun ;]