kervinck / gigatron-rom

System, apps and tooling for the Gigatron TTL microcomputer
BSD 2-Clause "Simplified" License
229 stars 81 forks source link

lcc: overflow of relational operators #64

Open kervinck opened 5 years ago

kervinck commented 5 years ago

The result of SUBW doesn't indicate if an overflow has occurred. Consequently, vCPU offers only signed comparisons against 0 for branching decisions. With that, the relational operators a<b, a<=b, a a>=b a>b only work when the difference between a and b fits in 15 bits. Ultimately this idiosyncrasy stems from the lack of status register in the 8-bit hardware architecture, and the underlying design idea that software can always compensate for missing hardware features...

TinyBASIC_v2.gt1 uses the following sequence to get correct comparisons over the full range:

0461  ee 02                    LDLW  $02                |..|
0463  fc 3a                    XORW  $3a                |.:|
0465  35 53 6a                 BGE   $046c              |5Sj|
0468  ee 02                    LDLW  $02                |..|
046a  90 6e                    BRA   $0470              |.n|
046c  ee 02                    LDLW  $02                |..|
046e  b8 3a                    SUBW  $3a                |.:|
0470  35 56 73                 BLE   $0475              |5Vs|

So prior to the standard subtract, it first checks if the operand highest bits are equal. If equal, it perform the normal SUBW for comparison. If not equal, the first operand is loaded for BLE/BLT (or the second operand in the case of BGE/BGT). This sequence costs 8 vCPU instructions or 18 bytes here, compared to 3/7 for the naive sequence. This is the reason that Tiny BASIC uses this idiom selectively.

For a compliant C compiler we need something similar. For example, now 0xffffu compares as smaller than 0. And we also can't really say that ints are just 15-bits wide instead.

My plan:

  1. First bite the bullet and implement it correctly for < <= > and >=. Try hard if we can get away with at most one new helper function in rt.py ("@prepcmp"?). For example, implement only in one order and reverse the order of operands if necessary.
  2. Provide macros/instrinsics/builtins to give access to the the naive relational operators when so desired by the programmer. Something like LT(a,b), GT(a,b), LE(a,b) and GE(a,b).
  3. Optimise away the prelude when comparing against a zero constant.
  4. Think deep about further optimisation options. For example, try to deduce the high bit of both sides by static analysis...?
  5. Consider some trivial rewrites. Eg. comparisons of an unsigned integer against 0 can be always be replaced by one of true/false/NE/EQ.
Cwiiis commented 5 years ago

Re 4, perhaps you could add a compiler built-in that users could mark variables they don't expect full 16-bit comparisons on? That could then aid with the static analysis.

[Edit MvK: removed quoted e-mail]

kervinck commented 5 years ago

Yes, this concept needs to evolve, hence the issue to track the thought process

My idea of prioritising things in LCC are:

  1. Correctness
  2. Usability (e.g. error messages, 64K support, essential library functions, ...)
  3. Optimisations ...
  4. Floating point :-)
kervinck commented 5 years ago

This has the potential to make a lot of code very inefficient, and I like to know the impact. For example:

if (putc(…) < 0)… can become nasty, whereas

if (putc(…) == EOF) … has potentially an efficient translation (using ADDI 1 + BNE)

kervinck commented 5 years ago

Perhaps we can squeeze in another new vCPU instruction that helps for these: "CMPW $DD".

kervinck commented 4 years ago

ROM v5 will have CMPHS and CMPHU instructions that are designed to solve this. See also https://github.com/kervinck/gigatron-rom/commit/f584a2aa986949fcb42cb4606005d74e1ebc395a

lb3361 commented 1 year ago

Suggesting to close this issue since glcc does this correctly already (on both roms v4 --with code-- and v5a+ --with cmphi/cmphs).