v923z / micropython-ulab

a numpy-like fast vector module for micropython, circuitpython, and their derivatives
https://micropython-ulab.readthedocs.io/en/latest
MIT License

ulab code size of certain functions grew a lot with gcc13, when not used with LTO #648

Open dhalbert opened 11 months ago

dhalbert commented 11 months ago

ulab commit: eacb0c9

ARM just released a gcc13 toolchain: https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads.

CircuitPython just switched its main branch to gcc 13. In general we see a decrease in code size with gcc 13 compared with gcc 12. But for ulab compiled on a build without LTO, we see a substantial increase. See https://github.com/adafruit/circuitpython/pull/8546#issuecomment-1787472714. If you expand diffs.txt in that comment and scroll to the end, you'll see that the functions with the largest increase in size, in bytes, are:

...
152 numerical_sum_mean_std_ndarray
206 ndarray_binary_add
206 ndarray_binary_multiply
232 ndarray_inplace_ams
570 ndarray_binary_equality
700 compare_function
712 ndarray_get_slice
764 ndarray_binary_power
798 ndarray_binary_subtract
824 ndarray_binary_floor_divide
846 ndarray_binary_true_divide
1606 ndarray_binary_more

This increase does not show up on LTO builds (see the atmel-samd Metro M4 numbers in that comment), only on non-LTO builds. I tested a couple of ports and see increases on stm and raspberrypi (RP2040).
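For anyone who wants to reproduce a per-function comparison like the list above, here is a minimal sketch. It assumes each build's symbol sizes were dumped with something like `arm-none-eabi-nm --print-size --size-sort firmware.elf`, which prints lines of the form `address size type name` with the size in hex. The helper names `parse_nm_sizes` and `size_growth` are hypothetical, and this is not necessarily how diffs.txt in the linked comment was produced.

```python
def parse_nm_sizes(nm_output):
    """Map symbol name -> size in bytes from `nm --print-size` style text."""
    sizes = {}
    for line in nm_output.splitlines():
        fields = line.split()
        # Expected fields: address, size (hex), type letter, symbol name.
        # Lines with any other shape (e.g. symbols without a size) are skipped.
        if len(fields) != 4:
            continue
        _address, size_hex, _kind, name = fields
        # Note: static functions that share a name overwrite each other here.
        sizes[name] = int(size_hex, 16)
    return sizes


def size_growth(old_sizes, new_sizes):
    """Return (delta_bytes, name) pairs for symbols in both builds, smallest growth first."""
    common = old_sizes.keys() & new_sizes.keys()
    deltas = [(new_sizes[name] - old_sizes[name], name) for name in common]
    return sorted(deltas)


if __name__ == "__main__":
    # Made-up sample data, not taken from any real build.
    old_dump = "08000000 00000010 T func_a\n08000010 00000020 T func_b\n"
    new_dump = "08000000 00000018 T func_a\n08000018 00000020 T func_b\n"
    for delta, name in size_growth(parse_nm_sizes(old_dump), parse_nm_sizes(new_dump)):
        print(delta, name)
```

Sorting ascending puts the largest increases at the end, matching the layout of the list above. Symbols that exist in only one of the two builds are ignored in this sketch, so functions that were newly inlined or split by one compiler would need separate handling.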

I haven't looked at these functions to see what might be unusual about them yet.

v923z commented 11 months ago

Thanks for the info! I'll try to look into this in the near future.