mxxii / leac

Lexer / tokenizer
MIT License

leac


Lexer / tokenizer.

Features

Changelog

Available here: CHANGELOG.md

Install

Node

> npm i leac
> yarn add leac
import { createLexer, Token } from 'leac';

Deno

import { createLexer, Token } from 'https://deno.land/x/leac@.../leac.ts';

Examples

const lex = createLexer([
  { name: '-', str: '-' },
  { name: '+' },
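  { name: 'ws', regex: /\s+/, discard: true }, // whitespace is matched but not emitted
  // Alternatives are tried left to right, so the multi-digit case must not
  // be shadowed by a single-digit alternative placed before it.
  { name: 'number', regex: /0|[1-9][0-9]*/ },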
]);

const { tokens, offset, complete } = lex('2 + 2');
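
A side note on regex rules with alternatives: JavaScript tries alternatives left to right and takes the first one that matches, not the longest. The sketch below uses only the built-in `RegExp` with the sticky flag, not leac itself, and assumes rules are matched at the current offset in a similar way. `matchAt`, `shortFirst` and `longFirst` are hypothetical names used for illustration.

```typescript
// Match `regex` at exactly `offset` in `input` (sticky matching).
// Returns the matched text, or null when nothing matches at that position.
function matchAt(regex: RegExp, input: string, offset: number): string | null {
  const sticky = new RegExp(regex.source, 'y');
  sticky.lastIndex = offset;
  const m = sticky.exec(input);
  return m ? m[0] : null;
}

// Single-digit alternative first: it wins even when more digits follow.
const shortFirst = /[0-9]|[1-9][0-9]+/;
// Zero, or a non-zero digit followed by any number of digits.
const longFirst = /0|[1-9][0-9]*/;

console.log(matchAt(shortFirst, '42', 0)); // '4'
console.log(matchAt(longFirst, '42', 0)); // '42'
```

With the first pattern, an input like `42` would be split into two one-digit number tokens, which is rarely what a parser expects.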

Published packages using leac

API

A word of caution

It is often tempting to rewrite tokens on the go. But this can be dangerous unless you are mindful of every edge case.

For example, who needs to carry string quotes around, right? The parser will only need the string content...

We'll have to consider the following things:

When put together, these things plus some intuition traps can lead to a broken array of tokens.

Strings can be empty, which means the content token can be absent. With neither content nor quotes left, the tokens array will most likely make no sense to a parser.

How to avoid potential issues:

Another note about quotes: if the grammar allows for different kinds of quotes and you still want to get rid of them early, think about how you are going to unescape the string later. Make sure you carry the information about the exact string kind at least in the token name, because you will need it when unescaping.
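
One way to follow this advice is to leave the quotes in the token text and strip them only at the point where the string is unescaped. The sketch below is a minimal illustration, not part of leac: `SketchToken` is a simplified stand-in for the real token type, and the rule names `dqString` and `sqString` as well as the `unquote` helper are hypothetical. It handles only escaped quotes and backslashes.

```typescript
// Simplified token shape for illustration only; real tokens carry more fields.
interface SketchToken {
  name: string; // rule name, here encoding which kind of quotes was used
  text: string; // matched text, quotes included
}

// Strip the surrounding quotes and unescape according to the string kind.
function unquote(token: SketchToken): string {
  const inner = token.text.slice(1, -1);
  switch (token.name) {
    case 'dqString': // "..." : \" and \\ are the recognized escapes
      return inner.replace(/\\(["\\])/g, '$1');
    case 'sqString': // '...' : \' and \\ are the recognized escapes
      return inner.replace(/\\(['\\])/g, '$1');
    default:
      throw new Error(`Not a string token: ${token.name}`);
  }
}

// An empty string is still represented by a token; only its content is empty.
console.log(unquote({ name: 'dqString', text: '""' })); // ''
console.log(unquote({ name: 'sqString', text: "'it\\'s'" })); // "it's"
```

Because the token is always present, an empty string does not leave a hole in the tokens array, and the token name tells the unescaping step which rules to apply.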

What about ...?

Some other lexer / tokenizer packages