zesterer / chumsky

Write expressive, high-performance parsers with ease.
https://crates.io/crates/chumsky
MIT License
3.64k stars 155 forks

Add a `spanned` method for automatically wrapping the parser result in a `Span` #640

Open TheOnlyTails opened 5 months ago

TheOnlyTails commented 5 months ago

This would be a very simple utility method I feel is missing right now, as a shorthand:

parser.map_extra(|it, e| (it, e.span()))
// into
parser.spanned()

It's not essential or anything, but it'll definitely be nice to have.

zesterer commented 5 months ago

I've considered this a few times, but in practice it's far from a convention to represent the two as a tuple: folks often create a struct Spanned<T>(T, Span); type that implements Deref<Target = T> for the same purpose.

I am tempted to make such a Spanned type a part of chumsky's API though. Would that suit you?
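
For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of a `Spanned<T>` wrapper that implements `Deref<Target = T>`. The `Span` alias (a byte range) is an assumption made for this example only; it is not chumsky's span type, and this is not a proposed API.

```rust
use std::ops::{Deref, Range};

// Hypothetical span type for this sketch: a byte range in the source text.
type Span = Range<usize>;

// A value paired with the span it was parsed from.
#[derive(Debug, Clone, PartialEq)]
pub struct Spanned<T>(pub T, pub Span);

impl<T> Deref for Spanned<T> {
    type Target = T;

    // Dereferencing yields the inner value, so its methods stay reachable.
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let ident = Spanned(String::from("foo"), 4..7);
    // `len` is a `String` method, reached through `Deref`.
    println!("{} chars at {:?}", ident.len(), ident.1);
}
```

Because of the `Deref` impl, AST code can mostly treat a `Spanned<T>` as a `T` and only touch the span when reporting errors.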

TheOnlyTails commented 5 months ago

In my opinion, a library taking a position in cases like this usually makes it easier to solve problems and reduces fragmentation, so I'm all in favor of an official solution.

zesterer commented 5 months ago

What worries me about this is that such a Spanned type would effectively become a pervasive part of a user's compiler since it would be present in the AST too. Perhaps this isn't a problem in itself, but chumsky tries very hard to be domain-specific, not being opinionated about other aspects of a compiler.

TheOnlyTails commented 4 months ago

IMO that's a good thing: it encourages good error reporting and makes it easier, since you can access the span right then and there. Besides, it's opt-in only for those who use the method.

mlgiraud commented 4 months ago

Is there any case where Spanned is something other than a token or element plus a span? In the current version of chumsky I also don't see an easy way to create a custom spanned type in a tokenizer that is then consumed by a parser, since the Input trait is sealed and the SpannedInput type requires tuples as tokens. So either it would make sense to have a Spanned type that is provided by the library, or to have a trait that needs to be implemented by a "Spanned" type such that chumsky can handle it, right? IMHO both would make sense, i.e. providing a trait and a default "Spanned" type that implements this trait.

zesterer commented 1 month ago

> or to have a trait that needs to be implemented by a "Spanned" type such that chumsky can handle it, right? IMHO both would make sense, i.e. providing a trait and a default "Spanned" type that implements this trait.

The problem with a trait is that you then end up needing to specify the implementor you want, adding more type annotations (and potentially confusing type errors).
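
The annotation burden described above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `FromSpanned`, the `Span` alias, and the free function `spanned` (standing in for a combinator) are made up for the example and are not chumsky APIs.

```rust
use std::ops::Range;

// Hypothetical span type for this sketch.
type Span = Range<usize>;

// Hypothetical trait implemented by every "spanned" wrapper type.
pub trait FromSpanned<T> {
    fn from_spanned(item: T, span: Span) -> Self;
}

pub struct Spanned<T>(pub T, pub Span);

impl<T> FromSpanned<T> for Spanned<T> {
    fn from_spanned(item: T, span: Span) -> Self {
        Spanned(item, span)
    }
}

// Plain tuples are another valid implementor.
impl<T> FromSpanned<T> for (T, Span) {
    fn from_spanned(item: T, span: Span) -> Self {
        (item, span)
    }
}

// Stand-in for a `.spanned()` combinator: the caller picks the wrapper `W`.
pub fn spanned<T, W: FromSpanned<T>>(item: T, span: Span) -> W {
    W::from_spanned(item, span)
}

fn main() {
    // let x = spanned(1u32, 0..1); // does not compile: `W` cannot be inferred
    let a: Spanned<u32> = spanned(1, 0..1);
    let b = spanned::<u32, (u32, Span)>(2, 1..2);
    println!("{} at {:?}, {} at {:?}", a.0, a.1, b.0, b.1);
}
```

Since more than one type implements the trait, every call site has to name the wrapper it wants, either with a binding annotation or a turbofish.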

One option might be to tie it to the Span trait itself. For example:

pub trait Span {
    type Spanned<T>;

    fn make_spanned<T>(self, item: T) -> Self::Spanned<T>;
}

...

pub struct Spanned<T, S = SimpleSpan>(T, S);

impl Span for SimpleSpan {
    type Spanned<T> = Spanned<T>;

    fn make_spanned<T>(self, item: T) -> Self::Spanned<T> { Spanned(item, self) }
}

Then a .spanned() combinator could be introduced that performs this spanning operation automatically.

How do you both feel about this?
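
As a rough, self-contained sketch of how the proposal above might behave, the following cuts the `Span` trait down to just the two proposed items. The `SimpleSpan` fields, the `LineSpan` type, and the free function `spanned` (standing in for the combinator) are assumptions made for illustration, not chumsky's actual definitions.

```rust
// Hypothetical, cut-down span trait containing only the proposed items.
pub trait Span: Sized {
    type Spanned<T>;

    fn make_spanned<T>(self, item: T) -> Self::Spanned<T>;
}

// Stand-in for the library's default span type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SimpleSpan {
    pub start: usize,
    pub end: usize,
}

// Default wrapper, as in the proposal.
#[derive(Debug, PartialEq)]
pub struct Spanned<T, S = SimpleSpan>(pub T, pub S);

impl Span for SimpleSpan {
    type Spanned<T> = Spanned<T>;

    fn make_spanned<T>(self, item: T) -> Self::Spanned<T> {
        Spanned(item, self)
    }
}

// A user-defined span (hypothetical) that prefers plain tuples.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LineSpan(pub u32);

impl Span for LineSpan {
    type Spanned<T> = (T, LineSpan);

    fn make_spanned<T>(self, item: T) -> Self::Spanned<T> {
        (item, self)
    }
}

// Stand-in for `.spanned()`: the span type alone decides the output type.
pub fn spanned<T, S: Span>(item: T, span: S) -> S::Spanned<T> {
    span.make_spanned(item)
}

fn main() {
    let a = spanned("x", SimpleSpan { start: 0, end: 1 });
    let b = spanned(42, LineSpan(3));
    println!("{:?} {:?}", a, b);
}
```

No annotations are needed at the call sites here, because the wrapper type is an associated type of the span rather than a free type parameter.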

TheOnlyTails commented 1 month ago

I really like this!

MeGaGiGaGon commented 2 weeks ago

I third this. I'm trying to migrate from pom, where Parser could easily be extended to add a spanned combinator, and not having it sucks majorly.