swagger-api / apidom

Semantic parser for API specifications
https://swagger-api.github.io/apidom/

Consider using @tree-sitter-grammars/tree-sitter-yaml as a replacement for tree-sitter-yaml #4033

Open char0n opened 2 months ago

char0n commented 2 months ago

The current state of our YAML 1.2 lexical analysis is described in https://github.com/swagger-api/apidom/issues/194#issuecomment-1602399929.

The goal of this issue is to determine whether we can use https://github.com/tree-sitter-grammars/tree-sitter-yaml as a drop-in replacement. In https://github.com/tree-sitter-grammars/tree-sitter-yaml/commit/ee09311, the new grammar added support for error recovery, the absence of which was blocking us from upgrading the tree-sitter infrastructure.

Refs https://github.com/swagger-api/apidom/issues/194
Refs https://github.com/tree-sitter/tree-sitter/issues/2339


This issue deals with the tree-sitter GitHub org recommending the new YAML grammar: https://github.com/tree-sitter/tree-sitter/issues/3005

char0n commented 2 months ago

After thorough testing, I can say that the new grammar is mostly compatible with what we need in ApiDOM. The Node.js bindings and the compiled WASM integrate with the other tree-sitter libraries. One incompatibility I've found is:

        asyncapi: 2.4.0
        info:
          version: '1.0.0'
          title Something # Missing mapping

With the old YAML grammar and the old tree-sitter, we correctly get an error just on line 4. With the new grammar and the new tree-sitter, we get an Error Node spanning the entire parsed string, and the other parsed nodes appear only inside that Error Node.
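
A minimal sketch of how this difference could be checked programmatically. It assumes only the node shape that tree-sitter syntax nodes expose (`type`, `startPosition`, `endPosition`, `children`) and the `ERROR` node type. The two trees are hand-written mocks that approximate the two behaviours described here; they are not real parser output, and `mockNode`, `collectErrorNodes` and `toLineRange` are hypothetical helper names, not part of ApiDOM or tree-sitter.

```javascript
// Recursively collect every node whose type is 'ERROR'.
function collectErrorNodes(node, found = []) {
  if (node.type === 'ERROR') {
    found.push(node);
  }
  for (const child of node.children ?? []) {
    collectErrorNodes(child, found);
  }
  return found;
}

// tree-sitter rows are 0-based; convert to a 1-based inclusive line range.
function toLineRange(node) {
  return { from: node.startPosition.row + 1, to: node.endPosition.row + 1 };
}

// Hypothetical helper that builds a mock node with a tree-sitter-like shape.
const mockNode = (type, startRow, endRow, children = []) => ({
  type,
  startPosition: { row: startRow, column: 0 },
  endPosition: { row: endRow, column: 0 },
  children,
});

// Approximation of the old behaviour: the error is confined to line 4.
const localizedErrorTree = mockNode('stream', 0, 3, [
  mockNode('document', 0, 3, [
    mockNode('block_mapping_pair', 0, 0),
    mockNode('block_mapping_pair', 1, 2),
    mockNode('ERROR', 3, 3),
  ]),
]);

// Approximation of the new behaviour: one error node wraps the whole input.
const wrappingErrorTree = mockNode('ERROR', 0, 3, [
  mockNode('block_mapping_pair', 0, 0),
  mockNode('block_mapping_pair', 1, 2),
]);

console.log(collectErrorNodes(localizedErrorTree).map(toLineRange));
console.log(collectErrorNodes(wrappingErrorTree).map(toLineRange));
```

With a real parser, the same walk would start from `tree.rootNode`. An error range that covers the whole document, as in the second mock, is no longer useful for pointing a user at the offending line.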

The npm scripts need to build the grammar locally:

  "scripts": {
    "build": "tree-sitter generate --no-bindings ./grammar.js",
    "build:wasm": "tree-sitter build --wasm --output ./build/tree-sitter-yaml.wasm .",
    "postbuild": "npm run --prefix schema/json build",
    "test": "tree-sitter test",
    "install": "node-gyp rebuild",
    "prebuildify": "prebuildify --napi --strip"
  }