theandrew168 / derzforth

Bare-metal Forth implementation for RISC-V
MIT License
42 stars 5 forks source link

Consider shrinking the word size for dict entries #3

Closed theandrew168 closed 3 years ago

theandrew168 commented 3 years ago

Currently, dictionary entries have their entire name spelled out and aligned to a 4-byte boundary. While this is no big deal for words with small names, it starts to really consume Flash / RAM as words get larger. For example, the word name alone for GPIO_CTL_OUT_ALT_OPEN_DRAIN occupies 28 bytes!

There are already strategies in the Forth world to mitigate this issue. Some use a standard 4-byte header to detail the word's length (1 byte) and then the first 3 characters of the name (3 bytes). This method is detailed in the book Threaded Interpretive Languages. This would lead to a high number of collisions for similarly-prefixed words, however.

I wonder if a simple 32-bit string hash would be better fit here? Maybe something like the basic hash function described in TPOP? It'd be fairly easy to implement in assembly, too. I'll do some basic testing to see what sort of collision issues I'd have for the words I currently define to utilize the RCU and GPIO.

theandrew168 commented 3 years ago

Wrote a quick script to check if any existing words would collide. Nope! I also calculated that switching would currently save 696 bytes.