Multiline functions: local variables

irevoire commented 3 months ago

Fixes https://github.com/sharkdp/numbat/issues/509 Fixes https://github.com/sharkdp/numbat/issues/201

The parser's error reporting got a little bit worse. We can't detect trailing characters at the end of a statement anymore, and the errors reported will all be ExpectedPrimary. To fix that, we would need either some special error handling when the last statement parsed was a function, or a different syntax.

Right now that's what I parse:

fn power_4(x: Scalar) = z
  where y = x * x
        z = y * y

Maybe we'd prefer something like this:

fn power_4(x: Scalar) = z
  where y = x * x
  where z = y * y

The issue with the current implementation is that after parsing a where, I must continue to parse as many variable declarations as possible, plus all whitespace and newlines.
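To make the greedy behaviour concrete, here is a minimal Rust sketch. It is not the code from this PR: the `Token` enum and `parse_where_definitions` are hypothetical names, and a toy "expression" is simply everything up to the next newline. It only illustrates why the newlines after the last definition end up consumed.

```rust
// Hypothetical toy tokens; Numbat's real tokenizer is different.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Equal,
    Newline,
    Other(String),
}

/// Consumes as many `name = <expr>` definitions as possible after a `where`,
/// skipping newlines in between. Returns the defined names and the index of
/// the first token that was not consumed.
fn parse_where_definitions(tokens: &[Token], mut pos: usize) -> (Vec<String>, usize) {
    let mut names = Vec::new();
    loop {
        // Skip any newlines in front of a potential definition.
        let mut look = pos;
        while look < tokens.len() && tokens[look] == Token::Newline {
            look += 1;
        }
        // A definition starts with `identifier =`.
        match (tokens.get(look), tokens.get(look + 1)) {
            (Some(Token::Ident(name)), Some(Token::Equal)) => {
                names.push(name.clone());
                look += 2;
                // Toy "expression": everything up to the next newline.
                while look < tokens.len() && tokens[look] != Token::Newline {
                    look += 1;
                }
                pos = look;
            }
            // No further definition: the newlines skipped above stay consumed,
            // so a later "statement must end with a newline" check cannot see them.
            _ => return (names, look),
        }
    }
}
```

For the tokens of `y = x * x`, `z = y * y`, a blank line, and then another statement, the returned index points at the first token of that next statement, past both newlines.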

sharkdp commented 3 months ago

Thank you so much for working on this. I am very excited about this feature.

The issue with the current implementation is that after parsing a where, I must continue to parse as many variable declarations as possible, plus all whitespace and newlines.

I think Haskell doesn't have this problem since it has an indentation-aware syntax, right? I'm hesitant, but would that be worth thinking about? For example, it might be reasonable to require the variable definitions to be indented more than the where keyword itself? Of course, that would also mean that the rest of the language becomes somewhat indentation-aware, because the statement following a function definition with a where-clause would need to have a lower indentation level than the where keyword.

Maybe that is the advantage of let … in …?

irevoire commented 3 months ago

For example, it might be reasonable to require the variable definitions to be indented more than the where keyword itself?

Oh gosh, please no no no this is literally the feature I despise the most over all existing languages 😭

- It will impact the whole language, as you said.
- It will make the code harder to write and maintain, with dumb bugs over characters we don't see.
- It will make a code formatter pretty much useless and harder to write.

I’m 100% against it.

Anyway, I see three ways of fixing the where personally:

1. After parsing the where part of the function, we could push a \n back as the last token and let the rest of the code work (this is not currently supported, I believe).
2. After parsing the where, we could set a boolean somewhere saying that the next time the block of code looking for trailing characters is executed, it should allow all trailing characters.
3. While looking for trailing characters, we check whether the latest statement parsed was a function containing a where.

None of them are really that ugly IMO, even if it’s not ideal.
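As a rough illustration of the last idea (checking whether the latest statement was a function with a where before complaining about trailing characters), here is a small Rust sketch. `Statement`, `TrailingCheck` and `check_trailing` are hypothetical names made up for this example and are not Numbat's actual types.

```rust
// Hypothetical statement kinds, for illustration only.
#[derive(Debug, PartialEq)]
enum Statement {
    Expression,
    FunctionDefinition { has_where_clause: bool },
}

#[derive(Debug, PartialEq)]
enum TrailingCheck {
    Ok,
    TrailingCharacters,
}

/// Decides whether the token following a statement counts as a trailing character.
fn check_trailing(last: &Statement, next_token: Option<&str>) -> TrailingCheck {
    // A where clause has already consumed the newlines after it,
    // so the usual "must be followed by a newline" rule is skipped.
    if matches!(last, Statement::FunctionDefinition { has_where_clause: true }) {
        return TrailingCheck::Ok;
    }
    match next_token {
        // End of input or a newline properly terminates the statement.
        None | Some("\n") => TrailingCheck::Ok,
        Some(_) => TrailingCheck::TrailingCharacters,
    }
}
```

The trade-off is visible in the first branch: real trailing garbage after a where clause would no longer be reported by this check.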

I believe the let ... in would work without any hacks, yes, but it would also make it slightly harder to share code between the current let and the where (not a big deal either tbh)

sharkdp commented 3 months ago

Oh gosh, please no no no this is literally the feature I despise the most over all existing languages 😭

Haha, fine. I agree with you, which is why I was hesitant to even mention it. If we can keep the proposed where syntax without that, I'm all for it.

Anyway, I see three ways of fixing the where personally:

I can't really judge which of your proposed solutions would be the best, without further looking into it myself.

Is the problem "only" with the error reporting, or are there actual parsing ambiguities?

irevoire commented 3 months ago

Is the problem "only" with the error reporting, or are there actual parsing ambiguities?

As long as = is not allowed in an expression, I think there are no ambiguities. It's only an issue with the error reporting because we're currently expecting all statements to have newlines at the end before the next statement or expression. But with the where, we need to eat all the newlines, which breaks the parsing of statements that want to remove newlines.

sharkdp commented 3 months ago

Ok. I'm not 100% sure I understood the problem correctly, but please let me know if I can help move this forward 👍

irevoire commented 3 months ago

Ok. I'm not 100% sure I understood the problem correctly, but please let me know if I can help move this forward 👍

Ok, so I found a fourth solution. I didn't realize that numbat wasn't using a streaming parser and could backtrack easily. So, I now track the current token before running into an error, so I can reset the parser to the latest useful character.
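The general checkpoint-and-rewind idea can be sketched in Rust as follows. This is a simplified illustration under assumptions, not the code in this PR: `Parser`, `try_parse`, `definition` and `advance` are hypothetical names, and tokens are plain strings.

```rust
// Hypothetical minimal parser over an in-memory token list.
struct Parser {
    tokens: Vec<String>,
    current: usize,
}

impl Parser {
    /// Runs `attempt`; on failure, rewinds to the position before the attempt.
    fn try_parse<T>(
        &mut self,
        attempt: impl FnOnce(&mut Parser) -> Result<T, String>,
    ) -> Option<T> {
        // Remember where we were before trying.
        let checkpoint = self.current;
        match attempt(&mut *self) {
            Ok(value) => Some(value),
            Err(_) => {
                // Possible because all tokens are kept in memory (no streaming).
                self.current = checkpoint;
                None
            }
        }
    }

    /// Toy rule: a definition is `<name> = <value>`.
    fn definition(&mut self) -> Result<(String, String), String> {
        let name = self.advance()?;
        if self.advance()? != "=" {
            return Err("expected '='".to_string());
        }
        let value = self.advance()?;
        Ok((name, value))
    }

    /// Returns the current token and moves one token forward.
    fn advance(&mut self) -> Result<String, String> {
        let token = self
            .tokens
            .get(self.current)
            .cloned()
            .ok_or_else(|| "unexpected end of input".to_string())?;
        self.current += 1;
        Ok(token)
    }
}
```

Calling `parser.try_parse(Parser::definition)` in a loop would collect definitions until one fails, leaving `current` at the first token that is not part of a definition, so the regular trailing-character check can run from there.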

If you're ok with the rest of the code I believe we can merge 😄

sharkdp commented 3 months ago

I plan to do a review later today hopefully. I added two commits (updating editor syntaxes, rewriting some Numbat code).

Doc generation seems to be broken at the moment. If I run book/build.py, the functions with a where clause seem to lose their @decorator information(?).

irevoire commented 3 months ago

Hey, FYI, I have a lot of friends at home currently. I probably won't work on this before next week

irevoire commented 3 months ago

Hey awesome, thanks a lot!

Multiline functions: local variables #519