osa1 / lexgen

A fully-featured lexer generator, implemented as a proc macro
MIT License

Provide a way to fail with "lexer error" #23

Open osa1 opened 2 years ago

osa1 commented 2 years ago

Currently there's no way for a semantic action to fail with lexgen's built-in "lexer error" when it detects an error. To fail, we need to specify a user error type and return a value of that type.

It could be useful to provide a fail method in the lexers to fail with the standard "lexer error" error, without having to define an error type.

Use case: Rust string literals do not allow a standalone carriage return, i.e. \r needs to be followed by \n.

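One way to implement this today is to switch to a separate rule after \r that only accepts \n, so that any other character fails with the lexer error: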

lexer! {
    rule Init {
        ...

        '"' => |lexer| lexer.switch(LexerRule::String),
    }

    rule String {
        ...

        '\r' => |lexer| lexer.switch(LexerRule::StringCR),

        _,
    }

    rule StringCR {
        '\n' => |lexer| lexer.switch(LexerRule::String),
    }
}

If we had a fail method on lexer structs it could be slightly more concise:

lexer! {
    rule Init {
        ...

        '"' => |lexer| lexer.switch(LexerRule::String),
    }

    rule String {
        ...

        "\r\n",

        '\r' => |lexer| lexer.fail(),

        _,
    }
}

With fail, we don't need another rule to make sure \r is always followed by \n. The same single-rule structure is already possible today, but it requires defining an error type and returning an error (not tested):

lexer! {
    type Error = UserError;

    rule Init {
        ...

        '"' => |lexer| lexer.switch(LexerRule::String),
    }

    rule String {
        ...

        "\r\n",

        '\r' =? |lexer| lexer.return_(Err(UserError { ... })),

        _,
    }
}
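
For reference, here is the check that all three variants above encode, written as plain Rust without lexgen. This is only a sketch to make the intended behavior concrete: InvalidToken and check_string_body are hypothetical names, and InvalidToken is a stand-in for a built-in lexer error, not lexgen's actual error type.

```rust
/// Hypothetical stand-in for a lexer's built-in error; not lexgen's actual type.
#[derive(Debug, PartialEq)]
struct InvalidToken {
    /// Byte offset of the offending character within the string body.
    offset: usize,
}

/// Scans the body of a string literal (the text between the quotes) and
/// rejects any `\r` that is not immediately followed by `\n`.
fn check_string_body(body: &str) -> Result<(), InvalidToken> {
    let mut chars = body.char_indices().peekable();
    while let Some((offset, c)) = chars.next() {
        if c == '\r' {
            match chars.peek() {
                // "\r\n" is consumed as a unit, like the `"\r\n"` rule above.
                Some(&(_, '\n')) => {
                    chars.next();
                }
                // A bare `\r`: the point where a `fail` method would be called.
                _ => return Err(InvalidToken { offset }),
            }
        }
    }
    Ok(())
}
```

With this sketch, check_string_body("a\r\nb") returns Ok(()), while check_string_body("a\rb") returns Err(InvalidToken { offset: 1 }). No user-defined error type is involved, which is the property the proposed fail method is meant to provide.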