fpjohnston / TECO-64

Enhanced and portable version of TECO text editor in C.

Arithmetic expression bug #28

Open LdBeth opened 1 month ago

LdBeth commented 1 month ago
*iZZZZZ`j``
*0A=``
90
*(0A*2)=``
180
*(0A+(0A))=``
180
*(0A==(0A))=``
0
*(0A==90)=``
0
*((0A)==(1A))=``
0

The last three expressions should give -1 (true).

This problem is specific to values of the form nA; other values, such as (Z==Z), do give the expected result.

rhaberkorn commented 1 month ago

Is there a == operator in TECO64???

LdBeth commented 1 month ago

@rhaberkorn Yes, these are extended operators and are enabled by default. They also have operator precedence, which is likewise enabled by default: https://github.com/fpjohnston/TECO-64/blob/master/doc/oper.md

However, the expression parser seems to be hand-written and complicated, so I don't feel like debugging it for now.

rhaberkorn commented 1 month ago

> Yes, these are extended operators and are enabled by default. They also have operator precedence, which is likewise enabled by default: https://github.com/fpjohnston/TECO-64/blob/master/doc/oper.md

Interesting, although of little help unless you also have short-cut AND/OR operators as a-b"x can do most of this as well. BTW, SciTECO also has modulo, but uses ^/. Accordingly, ^* is power.

The choice of << and >> is dubious IMHO. How can the parser know for sure this is not two nested loops - the inner one with a custom termination condition? Sure, you could write those as < <, but that will break compatibility, that TECO64 is certainly striving for.

== would also be tricky to adopt into SciTECO since it is currently interpreted as two prints - SciTECO is stack-oriented unlike more classical TECOs. And see above, what does a==b actually give you compared to a-b"=?

rhaberkorn commented 1 month ago

TECO64 also introduced operator precedence and associativity (just like SciTECO). But this will also break compatibility with TECOC. Just saying...

LdBeth commented 1 month ago

> although of little help unless you also have short-cut AND/OR operators as a-b"x can do most of this as well.

I find == helpful because comparison results can be combined using & and #. When a-b is used to test for equality, the results cannot easily be combined.

In the squ.tes macro, there is a very long dispatch table:

Q0-^^B"E                                ! if it's EB                    !
        @O!$$!                          !   goto $$                     !
'                                       ! endif                         !
Q0-^^G"E                                ! if it's EG                    !
        @O!$$!                          !   goto $$                     !
'                                       ! endif                         !
Q0-^^I"E                                ! if it's EI                    !
        @O!$$!                          !   goto $$                     !
'                                       ! endif                         !
Q0-^^L"E                                ! if it's EL                    !
        @O!$$!                          !   goto $$                     !
'                                       ! endif                         !
Q0-^^N"E                                ! if it's EN                    !
        @O!$$!                          !   goto $$                     !
'                                       ! endif                         !
Q0-^^Q"E                                ! if it's EQ                    !
        @O!E1!                          !   goto E1                     !
'                                       ! endif                         !
Q0-^^R"E                                ! if it's ER                    !
        @O!$$!                          !   goto $$                     !
...

To make it run faster, I rewrote it as:

(Q0-^^B)*(Q0-^^G)*(Q0-^^I)*(Q0-^^L)*(Q0-^^N)*(Q0-^^R)*(Q0-^^W)*
(Q0-^^Z)*(Q0-^^_) "E    !! this is safe from overflow
        @O!$$!          !! even when integer is 32 bit
'

For this particular combination, I used an SMT solver to prove that integer overflow in the multiplication cannot cause unintended behavior (assuming Q0 is an 8-bit integer).
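The overflow claim can also be checked by exhaustive enumeration, since Q0 only has 256 possible values. The following is a minimal Python sketch, not the SMT proof mentioned above: it assumes 32-bit wraparound arithmetic (modelled by masking with 0xFFFFFFFF) and the nine character codes from the rewritten macro. The names `TARGETS`, `product_is_zero_32`, and `false_positives` are made up for this illustration.

```python
# Character codes tested by the rewritten macro: B G I L N R W Z _
TARGETS = [ord(c) for c in "BGILNRWZ_"]
MASK32 = 0xFFFFFFFF  # assumption: integers wrap around at 32 bits


def product_is_zero_32(q0):
    """Return True if the product of (q0 - c) over all targets is 0 modulo 2**32."""
    product = 1
    for c in TARGETS:
        product = (product * (q0 - c)) & MASK32
    return product == 0


def false_positives():
    """List every 8-bit q0 whose product wraps to 0 although q0 matches no target."""
    return [q for q in range(256) if product_is_zero_32(q) and q not in TARGETS]


if __name__ == "__main__":
    print(false_positives())
```

An empty list means that, under these assumptions, the product is zero only when Q0 equals one of the nine characters.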

The squish macro runs about 20% faster on TECO-64 after this rewrite (both versions were squished for the comparison). The speedup comes not only from having fewer "E conditionals, but also from the smaller code size, which makes gotos run faster.

After I discovered ==, I no longer have to worry about integer overflow:

(Q0==^^B#Q0==^^G#Q0==^^I#Q0==^^L#Q0==^^N#
Q0==^^R#Q0==^^W#Q0==^^Z#Q0==^^_) "T
        @O!$$!
'

These operators do not add much to the expressiveness of the language, but they are useful for a programmer doing micro-optimization.
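To illustrate why == results combine where differences do not, here is a small Python model of the dispatch condition. It is a sketch under assumptions: == is taken to yield -1 for true and 0 for false (as the expected output at the top of this issue suggests), # is modelled as bitwise OR, and "T is assumed to treat negative values as true. `teco_eq` and `is_dollar_target` are hypothetical helper names.

```python
def teco_eq(a, b):
    """Model of the extended == operator: -1 (all bits set) if equal, else 0."""
    return -1 if a == b else 0


def is_dollar_target(q0):
    """Model of (Q0==^^B#Q0==^^G#...#Q0==^^_)"T for the nine listed characters."""
    result = 0
    for c in "BGILNRWZ_":
        result |= teco_eq(q0, ord(c))  # '#' modelled as bitwise OR
    return result < 0  # assumption: "T takes the branch for negative (true) values


if __name__ == "__main__":
    print(is_dollar_target(ord("G")), is_dollar_target(ord("Q")))
```

Because each comparison is either 0 or -1, the OR of any number of them stays within those two values, so no overflow analysis is needed. OR-ing raw differences such as Q0-^^B would not work, since the result is zero only when every difference is zero.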

> The choice of << and >> is dubious IMHO. How can the parser know for sure this is not two nested loops - the inner one with a custom termination condition?

The extended operators are only recognized inside (), where the implementation switches the parser to handle them. Outside parentheses, == still means octal print. The same applies to !, which is treated as a comment delimiter outside (). So I do have to extend the squish macro to handle ! inside ().
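The context dependence described here can be sketched as a scanner that tracks parenthesis depth. This is a simplified Python illustration, not TECO-64's actual parser: the function name `classify` and the labels it returns are invented, and it ignores everything except parentheses, == and !.

```python
def classify(source):
    """Label each '==' and '!' in source according to whether it is inside ()."""
    labels = []
    depth = 0  # current parenthesis nesting depth
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif source.startswith("==", i):
            labels.append(("==", "extended operator" if depth > 0 else "octal print"))
            i += 2
            continue
        elif ch == "!":
            labels.append(("!", "extended operator" if depth > 0 else "comment delimiter"))
        i += 1
    return labels


if __name__ == "__main__":
    print(classify("(0A==90)"))
    print(classify("0A=="))
```

A tool such as the squish macro needs the same depth tracking, because it cannot otherwise tell whether a ! opens a comment or belongs to an expression.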

> will break compatibility, that TECO64 is certainly striving for.

Several flags are provided to control the incompatible features, including the extended operators, but @fpjohnston decided to have the incompatible features enabled rather than disabled by default, maybe to encourage users to try out the new features.

> == would also be tricky to adopt into SciTECO since it is currently interpreted as two prints

I guess that requires the parser to do look ahead. If you provide other means to logically combine multiple comparison results then this is also not needed.

rhaberkorn commented 1 month ago

> I guess that requires the parser to do look ahead. If you provide other means to logically combine multiple comparison results then this is also not needed.

Yes, some kind of lookahead would be required. But this means that an interactively typed = would no longer print anything immediately, which is certainly not what is expected in an interactive editor like SciTECO, at least not without further hacks like undoing the stack pop if a second = is read. Well, I don't know. I would prefer ~= I guess. It's certainly something to consider. I put it on my TODO list. But I also do not like having a second parser mode activated by parentheses. You are supposed to be able to mix TECO code with arithmetic calculations. I use that a lot.
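The interactivity problem with lookahead can be shown with a toy model that consumes characters one at a time, as an interactive editor would. This is a hypothetical Python sketch and does not reflect SciTECO's real implementation; `feed_immediate` and `feed_with_lookahead` are invented names, and only the = character is modelled.

```python
def feed_immediate(chars):
    """Each '=' acts as soon as it is typed (two '=' give two prints)."""
    return ["print" for ch in chars if ch == "="]


def feed_with_lookahead(chars):
    """A '=' is held back until the next character shows whether it was '=='."""
    events = []
    pending = False  # True while a single '=' is waiting for its successor
    for ch in chars:
        if ch == "=":
            if pending:
                events.append("compare")
                pending = False
            else:
                pending = True
        elif pending:
            events.append("print")
            pending = False
    return events, pending


if __name__ == "__main__":
    print(feed_immediate("5="))
    print(feed_with_lookahead("5="))
```

In the lookahead variant, typing 5= produces no event yet: the print is deferred until another character arrives, which is the loss of immediate feedback described above.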

You can of course now already write something like (Q0-^^B"=-'1) # (Q0-^^G"=-'1)"T ... ' in SciTECO.

rhaberkorn commented 1 month ago

Another thing to consider in language design: having flags that influence parser behavior immediately makes your language unparsable (because of the halting problem) for any purpose like reliable syntax highlighting or even control flow.