gbdev / rgbds

Rednex Game Boy Development System - An assembly toolchain for the Nintendo Game Boy and Game Boy Color
https://rgbds.gbdev.io
MIT License
1.34k stars 172 forks source link

Allow lazy evaluation of some expressions #663

Open Rangi42 opened 3 years ago

Rangi42 commented 3 years ago

This is an internal requirement for features including:

Rangi42 commented 3 years ago

I don't know if constructing a whole RPN tree would be necessary for these, at least not for short-circuiting logic operators. Instead, consider an isEvaluating flag for the parser. When given false && X, as soon as it reaches the && is sets isEvaluating = false for the duration of parsing X; likewise for true || X, and for both cases of ?:. This flag would disable symbol lookup and actual math and just keep evaluating a trivial 0 value, so syntax errors would still occur but not evaluation errors.

User-defined functions are more complicated but might also be doable without RPN trees. Their body would be captured like a macro body when they're defined. Then when they're called, the parser would expand the function body similar to a macro body, but instead of substituting the parameters \1, \2, etc at lex time, it would evaluate the parameters like any other symbols at parse time. This would mean keeping a context of which function it's in so that the parameter names map to the right argument values (symbol "layering" as mentioned in #662).

ISSOtm commented 3 years ago

The proposed solution for user-defined functions would imply re-parsing them, which is slow, and can be avoided since the expression tree won't change. This is why I'm pushing so hard to decouple expression evaluation from parsing.

Evaluation can still be eager for purely numeric expressions (1 + 1 will never change), and yield a "bad" RPN expression (say, pointing to NULL, so it doesn't require allocating a 1-byte buffer) for e.g. 1 / 0; that expression could be discarded due to short-circuiting, or turned into an error when trying to get a number out of it.

I believe this is necessary for a proper user-defined function implementation, notably to avoid technical debt, and since I think it also encompasses the other usages, I think we shouldn't use "partial" impls. This is why I've been reluctant to merge #665.

ISSOtm commented 3 years ago

Here are my thoughts on implementing this:

This way, db 1 || 2 / 0 would first generate the RPN buffer RPN_CONST 1 RPN_CONST 2 RPN_CONST 0 DIV, which RGBASM would collapse to RPN_CONST 1 RPN_DIV_BY_0, and seeing the ||, RGBASM would see the first constant, and ignore the RHS. (Note that RGBASM would actually internally store the expressions differently due to their "const" status, but the principle stays the same.) Assuming 1 is an unknown (wNonZeroLabel), RGBLINK would basically perform the same, though instead having to skip the RHS in the buffer, but that should be doable by keeping track of pushes and pops in the RPN stack, without actually evaluating anything.

Am I missing anything there?

Rangi42 commented 3 years ago

To avoid reserving half of the IDs for errors, the defined error values could count down from $FF, and check for a minimum threshold value instead of the high bit.