TG9541 / stm8ef

STM8 eForth - a user friendly Forth for simple µCs with docs
https://github.com/TG9541/stm8ef/wiki

Faster, leaner "pictured number output" #433

Closed TG9541 closed 3 years ago

TG9541 commented 3 years ago

While working on optional words for Forth Standard compatibility, it became clear that Forth Standard compliant "pictured number output", with # ( ud -- ud ) instead of # ( u -- u ) (double instead of single math), would increase the code size only marginally, but the double math would make printing numbers slower. This might break applications that print numbers in a background task, because the task run-time limit of 1ms would be exceeded (unless a fast 32bit/8bit division or buffered I/O is used).
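To illustrate where the extra time goes, here is a minimal sketch of a double-cell digit step in standard Forth. The names >digit, d#, d#S and .d are hypothetical and chosen so they don't clash with the host system's own words; this is not the STM8 eForth code. The point is that each digit needs two UM/MOD steps, and on a 16bit Forth each of those is a 32bit/16bit division.

```forth
\ Sketch only: >digit , d# , d#S and .d are hypothetical names,
\ not STM8 eForth words.
: >digit ( u -- char )        \ 0..35 -> '0'..'9' 'A'..'Z'
  DUP 9 > 7 AND + [CHAR] 0 + ;

\ Divide the double ud by BASE with two UM/MOD steps:
\ first the high cell, then its remainder combined with the low cell.
: d# ( ud -- ud' )
  BASE @ >R            \ ( lo hi )            R: base
  0 R@ UM/MOD          \ ( lo rem-hi quot-hi )
  R> SWAP >R           \ ( lo rem-hi base )   R: quot-hi
  UM/MOD               \ ( rem quot-lo )
  SWAP >digit HOLD     \ ( quot-lo )          remainder becomes a digit
  R> ;                 \ ( quot-lo quot-hi )

: d#S ( ud -- 0 0 )  BEGIN d# 2DUP OR 0= UNTIL ;
: .d  ( ud -- )      <# d#S #> TYPE ;
```

On a standard system, DECIMAL 70000. .d should then print the double number the same way D. would, just without the trailing space.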

While exploring options it also became clear that the current # can be made faster by using the DIV X,A (16bit / 8bit) instruction, and leaner by in-lining the code of DIGIT and EXTRACT. These are eForth words that other 16bit Forth implementations (e.g. the well-known F83) don't use and that don't appear in the Forth Standard.
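As a rough model of the single-cell variant, the following standard Forth sketch does the divide and the digit conversion in one word, without separate DIGIT and EXTRACT helpers. The names u#, u#S and .u are hypothetical, and UM/MOD merely stands in for the DIV X,A instruction (which leaves the quotient in X and the remainder in A); the actual STM8 eForth code is assembler and will look different.

```forth
\ Sketch only: u# , u#S and .u are hypothetical names chosen to avoid
\ clashing with the host Forth's own # , #S and U.
: u# ( u -- u' )
  0 BASE @ UM/MOD      \ ( rem quot )  one unsigned divide by BASE
  SWAP                 \ ( quot rem )
  DUP 9 > 7 AND +      \ skip the gap between '9' and 'A'
  [CHAR] 0 + HOLD ;    \ to ASCII, prepend to the output string

: u#S ( u -- 0 )  BEGIN u# DUP 0= UNTIL ;

\ #> of a standard Forth drops a double, so a dummy high cell is added.
: .u ( u -- )  <# u#S 0 #> TYPE ;
```

For example, DECIMAL 65535 .u goes through u# five times, one division per digit, compared with two divisions per digit for a double-cell # .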

Forth Standard compatibility with double number output (e.g. D.) can be provided later through library words.

TG9541 commented 3 years ago

I did some testing with PulseView and the following word .. that toggles a GPIO with PLo and PHi:

: .. ( u -- ) PLo <# PHi #S PLo #> PHi TYPE ;

I get the following timing for DECIMAL 65535 ..:

[image: PulseView capture of the GPIO toggles]

The following table shows that # and #S are much faster than in the old version, by a factor of about 2.4 to 7.4 depending on the number of digits and the base:

| .. argument | Base | <# #S #> old [µs] | <# #S #> improved [µs] |
|---|---|---|---|
| 65535 | 10 | 155 | 31 |
| 6 | 10 | 53 | 22 |
| 65535 | 16 | 131 | 29 |
| 65535 | 2 | 446 | 60 |

The toggles around <# and #> also revealed that about 4µs can be saved by coding the 16bit <literal> + in PAD in assembler (13µs down to 9µs; the numbers in the table include this optimization). In the background (BG) task PAD is slightly faster, as it returns a constant address.