kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Type incompatibility between nearley and moo #527

Closed lorefnon closed 3 years ago

lorefnon commented 4 years ago

First of all, thanks for creating this library. Developing with nearley has been such a great experience.


While moo is the recommended lexer for nearley, using moo together with the TypeScript preprocessor results in type errors in the generated grammar.

Minimal example:

import moo from "moo";

const l = moo.compile({})

interface NearleyLexer {
  reset: (chunk: string, info: any) => void;
  next: () => NearleyToken | undefined;
  save: () => any;
  formatError: (token: NearleyToken) => string;
  has: (tokenType: string) => boolean;
};

interface Grammar {
  Lexer: NearleyLexer | undefined;
  ParserRules: NearleyRule[];
  ParserStart: string;
};

const grammar: Grammar = {
  Lexer: l,
  ParserRules: [
      // ... 
  ]
}

This fails with the following error:

Type 'Lexer' is not assignable to type 'NearleyLexer'.
  Types of property 'formatError' are incompatible.
    Type '(token: Token, message?: string | undefined) => string' is not assignable to type '(token: NearleyToken) => string'.
      Types of parameters 'token' and 'token' are incompatible.
        Type 'NearleyToken' is missing the following properties from type 'Token': offset, text, lineBreaks, line, col

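The cause is that in strict mode (specifically with strictFunctionTypes) TS checks the parameters of function-typed properties contravariantly. moo's formatError requires a full Token, so it cannot be assigned to a property whose function must accept any NearleyToken.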

A simple solution is to declare the members of the generated interface with method syntax instead of as function-typed properties, because method parameters are still checked bivariantly:

interface NearleyLexer {
  reset(chunk: string, info: any): void;
  next(): NearleyToken | undefined;
  save(): any;
  formatError(token: NearleyToken): string;
  has(tokenType: string): boolean;
};

With the above type in place, the type checker does not complain anymore.
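
The difference between the two declaration styles can be reproduced without nearley or moo. The following is a minimal sketch: NearleyToken is reduced to a single field, and MooLikeToken, PropertyStyleLexer, MethodStyleLexer and mooLikeLexer are hypothetical stand-ins, not the real generated or moo declarations.

```typescript
// Simplified stand-in for the token type in the generated grammar (assumption).
interface NearleyToken {
  value: any;
}

// Hypothetical stand-in for moo's Token, with the extra required fields
// named in the error message above.
interface MooLikeToken {
  value: string;
  text: string;
  offset: number;
  lineBreaks: number;
  line: number;
  col: number;
}

// Function-typed property: with strictFunctionTypes the parameter is
// checked contravariantly.
interface PropertyStyleLexer {
  formatError: (token: NearleyToken) => string;
}

// Method syntax: the parameter is still checked bivariantly.
interface MethodStyleLexer {
  formatError(token: NearleyToken): string;
}

// A lexer whose formatError demands the richer token type, like moo's.
const mooLikeLexer = {
  formatError(token: MooLikeToken, message?: string): string {
    return `${message ?? "invalid syntax"} at line ${token.line} col ${token.col}`;
  },
};

// Rejected in strict mode, so it is left commented out:
// const rejected: PropertyStyleLexer = mooLikeLexer;

// Accepted, because the target member is declared with method syntax.
const accepted: MethodStyleLexer = mooLikeLexer;

const sampleToken: MooLikeToken = {
  value: "x", text: "x", offset: 0, lineBreaks: 0, line: 1, col: 1,
};
const formatted = accepted.formatError(sampleToken);
```

Note that the bivariant check is less strict: it also lets a caller pass a plain NearleyToken to a function that expects the extra fields, so the method-syntax fix trades some type safety for compatibility.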

bgschiller commented 4 years ago

I was bitten by this as well. In the meantime, this is the fix I've put together (in package.json):

"scripts": {
    "build": "nearleyc ./src/grammar.ne --out ./src/grammar.ts &&  sed -E -i 's/formatError: \\(token: NearleyToken\\)/formatError: (token: moo.Token)/g' src/grammar.ts && tsc"
}

The sed script performs the following change on the generated file:

- formatError: (token: NearleyToken)
+ formatError: (token: moo.Token)
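
The sed call uses -i without a backup suffix, which is GNU sed syntax; BSD/macOS sed expects an explicit suffix argument there. As a rough portable alternative, the same substitution can be sketched as a small TypeScript function. patchFormatError is a hypothetical helper name, and reading and writing src/grammar.ts is left out to keep the sketch self-contained.

```typescript
// Hypothetical helper: applies the same substitution as the sed command
// to the text of the generated grammar file.
function patchFormatError(generatedSource: string): string {
  return generatedSource.replace(
    /formatError: \(token: NearleyToken\)/g,
    "formatError: (token: moo.Token)",
  );
}

// Example on a single line of generated output.
const before = "  formatError: (token: NearleyToken) => string;";
const after = patchFormatError(before);
```

A build step would read the generated file, pass its contents through this function, and write the result back before running tsc.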

bandaloo commented 3 years ago

Yeah, this is annoying. One solution is to cast:

@{%
import { lexer } from "./lexer";
const nearleyLexer = (lexer as unknown) as NearleyLexer;
%}

@lexer nearleyLexer

This works but it would be nice if we could pass in the type of the lexer and token types:

@preprocessor typescript moo.Lexer moo.Token

kach commented 3 years ago

Sorry — I'm not really a TypeScript person so I don't quite follow the details. But it seems others have had this problem as well. @bandaloo's suggestion seems like a reasonable fix. If anyone wants to implement it I can look at the PR.

bandaloo commented 3 years ago

@kach My PR solves the problem in a similar way to my previous comment but not exactly.