trezor / trezor-firmware

:lock: Trezor Firmware Monorepo
https://trezor.io

Shorten identifiers to save space #4059

Open onvej-sl opened 2 months ago

onvej-sl commented 2 months ago

This is an idea on how to reduce the size of the core firmware.

I noticed that the firmware binary contains the names of all identifiers used in the Python code. This makes it possible to look up variables by name at runtime (for example, globals()[variable_name]). Since we enforce static typing wherever possible, I believe we don't need this feature. Consequently, we could use a preprocessor to rename all identifiers to shorter names, thereby saving space.
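As a small illustration of the feature that renaming would give up, the sketch below looks up a global by its string name, once in the original source and once in a shortened version. The variable name `session_timeout`, the shortened name `a`, and the helper `lookup` are made up for this example; nothing here is taken from the firmware.

```python
def lookup(source: str, name: str):
    """Run `source` as a module body and fetch a global by its string name."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]  # same mechanism as globals()[name]


original = "session_timeout = 30\n"
shortened = "a = 30\n"  # hypothetical output of a renaming preprocessor

print(lookup(original, "session_timeout"))

try:
    lookup(shortened, "session_timeout")
except KeyError as exc:
    # The string still says "session_timeout", but the global is now "a".
    print("lookup failed after renaming:", exc)
```

Any code that builds identifier names from strings at runtime would need to be found and either excluded from renaming or rewritten.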

I discovered that the Python standard library includes the ast module, which allows you to:

See this code.
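The linked code is not reproduced here. As a rough idea of what an `ast`-based renamer could look like, here is a minimal single-file sketch using `ast.NodeTransformer` and `ast.unparse` (Python 3.9+). The names `Shortener`, `shorten` and `short_names` are hypothetical. It deliberately keeps builtins, imported names, dunder names and any name used as a keyword argument. It does not handle classes, attributes, or `global`/`nonlocal` statements, so it is a starting point rather than a working preprocessor.

```python
import ast
import builtins
import itertools
import keyword
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as candidate replacement identifiers."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


class Shortener(ast.NodeTransformer):
    def __init__(self, keep):
        # Names that must survive unchanged: builtins plus caller-supplied ones.
        self.keep = set(keep) | set(dir(builtins))
        self.mapping = {}
        self._fresh = short_names()

    def _rename(self, name):
        if name in self.keep or name.startswith("__"):
            return name
        if name not in self.mapping:
            for candidate in self._fresh:
                # Skip keywords ("as", "if", ...) and names we promised to keep.
                if not keyword.iskeyword(candidate) and candidate not in self.keep:
                    self.mapping[name] = candidate
                    break
        return self.mapping[name]

    def visit_Name(self, node):
        node.id = self._rename(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return node

    def visit_FunctionDef(self, node):
        node.name = self._rename(node.name)
        self.generic_visit(node)
        return node


def shorten(source: str) -> tuple[str, dict[str, str]]:
    """Return the renamed source and the old-name -> new-name mapping."""
    tree = ast.parse(source)
    keep = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                keep.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.keyword) and node.arg is not None:
            keep.add(node.arg)  # keyword arguments must match parameter names
    transformer = Shortener(keep)
    return ast.unparse(transformer.visit(tree)), transformer.mapping


SOURCE = """
def add_numbers(first_value, second_value):
    total = first_value + second_value
    return total

result = add_numbers(2, 3)
"""

code, mapping = shorten(SOURCE)
print(code)
print(mapping)
```

A single flat mapping is used for the whole file, which keeps the renaming consistent across scopes at the cost of not reusing short names between functions.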

The only drawback of this approach is that the package can only handle a single Python file at a time. If you have a Python project with multiple files, you must parse each file separately, determine which imports are internal and which are external, and rename the identifiers consistently.
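The "internal vs. external" step could be approached roughly as follows. This is a sketch under assumptions: the helper `classify_imports` is hypothetical, and the set of project packages passed in (`trezor`, `apps`) is only illustrative. Relative imports are treated as internal by definition.

```python
import ast


def classify_imports(source: str, project_modules: set[str]) -> tuple[set[str], set[str]]:
    """Split the imports of one file into (internal, external) module names."""
    internal, external = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:
                # Relative import: always refers to the project itself.
                internal.add("." * node.level + (node.module or ""))
                continue
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            (internal if top_level in project_modules else external).add(name)
    return internal, external


EXAMPLE = """
import trezor.ui
from micropython import const
from . import helpers
"""

internal, external = classify_imports(EXAMPLE, {"trezor", "apps"})
print(sorted(internal))
print(sorted(external))
```

Identifiers imported from internal modules would then need the same old-to-new mapping in every file, while anything reached through an external module has to keep its original name.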

Based on the number of identifiers in the binary and the space they occupy, my estimate is that this could save a few dozen kilobytes.

matejcik commented 2 months ago

fwiw we need libcst to do this ("concrete" syntax tree, that is, retaining comments & whitespace) -- otherwise we get mismatches of line numbers between frozen and regular build, which will make it a PITA to debug
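The line-number concern can be reproduced with the standard library alone. The sketch below (not using libcst, and with a made-up `SOURCE` and helper `assert_line`) round-trips a snippet through `ast.parse` and `ast.unparse` and compares the line on which the `assert` statement ends up.

```python
import ast

SOURCE = '''# setup comment

def check(value):
    # validate input

    assert value > 0, "must be positive"
'''


def assert_line(source: str) -> int:
    """Return the line number of the first assert statement in `source`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assert):
            return node.lineno
    raise ValueError("no assert statement found")


# ast.unparse regenerates code from the tree, so comments and blank lines are lost.
regenerated = ast.unparse(ast.parse(SOURCE))

print("original line:   ", assert_line(SOURCE))
print("regenerated line:", assert_line(regenerated))
```

A traceback from the regenerated code would point at a different line than the one a developer sees in the repository, which is the mismatch a whitespace-preserving tool is meant to avoid.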

assuming that, we can certainly try this, although IMHO dropping the data directly from the qstr table would be more stable than mangling the input source code. that might require some micropython hackery though