unikraft / lib-musl

musl: A C standard library

locale: Add option to disable character maps #40

Closed dinhngtu closed 1 year ago

dinhngtu commented 1 year ago

Description of changes

Musl's iconv includes many character sets by default. Provide an option to disable most of them, namely the non-Unicode CJK charsets and the legacy codepages other than "latin1".

Disabling the option reduces the uncompressed binary size by ~150K.

marcrittinghaus commented 1 year ago

Hey @dinhngtu,

Thanks for this! PRs reducing image size and increasing efficiency are very much appreciated 😃

I tested the PR with nginx and disabled the legacy locales via your new option. I used a performance build + dead code elimination (DCE) + link time optimization (LTO). The binary size remained the same, though 😢 Could it be that one of the optimizations already kicks in even without the modification? Or maybe I did something wrong!?

What was your test setup where you measured the 150K size reduction? Is there a quick way to check based on the symbols/final image if the charsets have been dropped?

dinhngtu commented 1 year ago

Hi,

DCE removes the iconv functions because Nginx does not use them, so the size reduction is not visible there. For a program that uses iconv (see example), the binary sizes are as follows (with both DCE and LTO enabled):

-rwxr-xr-x 1 tu tu 327128 Apr  6 12:00 build-befor/app-makeworld_kvm-x86_64
-rw-r--r-- 1 tu tu 200538 Apr  6 12:00 build-befor/app-makeworld_kvm-x86_64.gz
-rwxr-xr-x 1 tu tu 171480 Apr  6 11:59 build-after/app-makeworld_kvm-x86_64
-rw-r--r-- 1 tu tu  67255 Apr  6 11:59 build-after/app-makeworld_kvm-x86_64.gz
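
For illustration, here is a minimal sketch of the kind of code that keeps iconv in the image, so DCE cannot drop it. It is not the example linked above; the function name `latin1_to_utf8` is hypothetical, and it converts from ISO-8859-1 on the assumption that this is the "latin1" charset the option keeps.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns the number of bytes written (excluding the NUL), or -1 on error
 * (charset unavailable, or output buffer too small). */
long latin1_to_utf8(const char *in, char *out, size_t outcap)
{
    if (outcap == 0)
        return -1;

    /* iconv_open takes the target charset first, then the source. */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;        /* iconv() takes a non-const char ** */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outcap - 1;   /* reserve one byte for the NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *outp = '\0';
    return (long)(outp - out);
}
```

Calling this from the application's `main` is enough to make the linker keep the iconv code and whatever charset tables it references.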

Similar space savings can be achieved for a simple hello-world without enabling DCE/LTO (which have their own downsides):

-rwxr-xr-x 1 tu tu 872752 Apr  4 20:28 build-full/app-makeworld_kvm-x86_64
-rw-r--r-- 1 tu tu 453633 Apr  4 20:28 build-full/app-makeworld_kvm-x86_64.gz
-rwxr-xr-x 1 tu tu 721200 Apr  4 20:35 build-slim/app-makeworld_kvm-x86_64
-rw-r--r-- 1 tu tu 320142 Apr  4 20:35 build-slim/app-makeworld_kvm-x86_64.gz

A simple way to check whether the legacy charsets are present (i.e. CONFIG_LIBMUSL_LOCALE_LEGACY=y) is to look for the charset table symbols in the image: jis0208, gb18030 and big5.
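
Besides inspecting symbols, a runtime probe is also possible. The sketch below assumes that, with the option disabled, `iconv_open` fails for the removed charsets; that behaviour is not confirmed in this thread. The helper name `charset_available` is hypothetical.

```c
#include <iconv.h>

/* Hypothetical helper: return 1 if iconv can convert from `name` to UTF-8,
 * 0 otherwise. iconv_open returns (iconv_t)-1 when the conversion is not
 * supported by the implementation. */
int charset_available(const char *name)
{
    iconv_t cd = iconv_open("UTF-8", name);
    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}
```

Under that assumption, probing a name such as "SHIFT_JIS" or "BIG5" would be expected to return 0 on a slim build and 1 on a full build, while "ISO-8859-1" should be available in both.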

marcrittinghaus commented 1 year ago

Ok, great! Thanks 😃 Too bad that the nginx image won't benefit from this (I could currently use a size reduction there), but for all other cases where iconv is used this is a very good addition 😎