maciejhirsz / logos

Create ridiculously fast Lexers
https://logos.maciej.codes
Apache License 2.0

Emit Alternate Token From Callback #391

Open UnexDev opened 1 month ago

UnexDev commented 1 month ago

Is it possible to emit an alternate token from a callback (i.e., a token other than the one that was lexed)?

For example, a stack-based, whitespace-sensitive parser:

```rust
fn indent(lex: &mut logos::Lexer<Token>) -> /* ??? */ {
    let mut indent = 0;
    let mut consumed = 0;

    // maybe use a take_while then a replace?
    for c in lex.remainder().chars() {
        match c {
            // A tab advances to the next multiple of 8.
            '\t' => indent += 8 - indent % 8,
            ' ' => indent += 1,
            _ => break,
        }
        consumed += 1;
    }

    // Spaces and tabs are one byte each, so this consumes exactly the whitespace.
    lex.bump(consumed);

    // An empty stack means indentation level 0.
    let last = lex.extras.indent.last().copied().unwrap_or(0);

    if indent < last {
        lex.extras.indent.pop();
        /* emit Dedent token instead of Indent */
    } else if indent > last {
        lex.extras.indent.push(indent);
        /* emit Indent */
    }
}
```
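
The indentation arithmetic in the snippet above can be checked on its own, without a lexer. The following is a minimal sketch using only the standard library; `indent_width` and `TAB_STOP` are hypothetical names chosen for illustration and are not part of logos. It assumes tab stops every 8 columns, as in the snippet.

```rust
/// Assumed tab-stop width: a tab advances to the next multiple of 8.
const TAB_STOP: usize = 8;

/// Returns `(width, bytes)`: the column reached after the leading spaces and
/// tabs of `s`, and the number of bytes that leading whitespace occupies.
fn indent_width(s: &str) -> (usize, usize) {
    let mut width = 0;
    let mut bytes = 0;
    for c in s.chars() {
        match c {
            '\t' => width += TAB_STOP - width % TAB_STOP,
            ' ' => width += 1,
            _ => break,
        }
        bytes += c.len_utf8();
    }
    (width, bytes)
}
```

In a callback, the second value would be the amount to pass to `lex.bump`, and the first is the width to compare against the top of the indent stack.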

kotx commented 1 month ago

You can return a Token from your callback. There are also utilities such as Filter, Skip, and FilterResult. See https://logos.maciej.codes/callbacks.html

UnexDev commented 1 month ago

> You can return a Token from your callback. There are also utilities such as Filter, Skip, and FilterResult. See https://logos.maciej.codes/callbacks.html

Thank you, that worked wonderfully!

Just one more thing: can I emit several tokens from a single callback? I need to emit multiple Dedent tokens at once. I'm assuming this is not possible?

kotx commented 1 month ago

> Just one more thing: can I emit several tokens from a single callback? I need to emit multiple Dedent tokens at once. I'm assuming this is not possible?

I don't think it's possible, but I decided to use a Dedent(usize) token as a workaround.
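
For anyone landing here later, this is roughly how the Dedent(usize) workaround could be shaped. It is a sketch written without logos so that it runs on its own: the `Token` enum, `on_indent`, and `expand` are hypothetical names, and in a real lexer the stack would presumably live in `lex.extras` with the callback returning the token.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Indent,
    /// Number of indentation levels closed at once.
    Dedent(usize),
}

/// Compares a new line's indentation width against the stack of open levels
/// and returns the token to emit, or `None` if the level is unchanged.
fn on_indent(stack: &mut Vec<usize>, width: usize) -> Option<Token> {
    let current = stack.last().copied().unwrap_or(0);
    if width > current {
        stack.push(width);
        return Some(Token::Indent);
    }
    // Pop every open level deeper than the new width and count them.
    let mut closed = 0;
    while stack.last().map_or(false, |&top| top > width) {
        stack.pop();
        closed += 1;
    }
    if closed > 0 {
        Some(Token::Dedent(closed))
    } else {
        None
    }
}

/// Expands each `Dedent(n)` into `n` single-level dedents, for a parser that
/// expects one token per closed level.
fn expand(tokens: impl IntoIterator<Item = Token>) -> impl Iterator<Item = Token> {
    tokens.into_iter().flat_map(|t| match t {
        Token::Dedent(n) => vec![Token::Dedent(1); n],
        other => vec![other],
    })
}
```

The lexer still emits one token per match, and the expansion happens in a thin iterator layer between lexer and parser. Note that this sketch does not report an error when a line dedents to a width that was never pushed onto the stack.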