Specialize target and level arguments that can be statically determined

EFanZh commented 10 months ago

This PR intends to optimize binary size by avoiding passing target and level arguments if they can be determined statically.

I have rewritten the implementation of the log macros to parse each argument separately in order to map macro arguments to the underlying __private_api::log function arguments.

Here is the binary size result tested using https://github.com/EFanZh/log/tree/binary-size-test/cargo-binstall:

opt-level	lto	codegen-units	original size	optimized size	diff
3	off	1	15714240	15714240	0
3	off	16	18257920	18257920	0
3	thin	1	16803704	16791416	-12288
3	thin	16	19269592	19257304	-12288
3	fat	1	15415128	15402840	-12288
3	fat	16	16418616	16398136	-20480
"z"	off	1	11921344	11917248	-4096
"z"	off	16	13867008	13871104	4096
"z"	thin	1	12806008	12801912	-4096
"z"	thin	16	14108632	14104536	-4096
"z"	fat	1	10385240	10381144	-4096
"z"	fat	16	10893144	10889048	-4096

Here is the runtime performance benchmark result tested using https://github.com/EFanZh/log/tree/benchmark.

original  => cycles: 1110, instructions: 70, wall time: 310ns.
optimized => cycles: 1126, instructions: 70, wall time: 280ns.

Note that I have added #[inline(never)] to the log function to prevent the compiler from calling log_impl directly which could affect the optimization. Without #[inline(never)], the binary size change will be:

opt-level	lto	codegen-units	original size	optimized size	diff
3	off	1	15714240	15714240	0
3	off	16	18257920	18257920	0
3	thin	1	16803704	16803704	0
3	thin	16	19269592	19261400	-8192
3	fat	1	15415128	15419224	4096
3	fat	16	16418616	16418616	0
"z"	off	1	11921344	11921344	0
"z"	off	16	13867008	13871104	4096
"z"	thin	1	12806008	12801912	-4096
"z"	thin	16	14108632	14104536	-4096
"z"	fat	1	10385240	10385240	0
"z"	fat	16	10893144	10889048	-4096

Personally, I prefer keeping #[inline(never)], since it has better binary size, and no significant performance cost.

EFanZh commented 9 months ago

Hi @Thomasdezeeuw, this PR is ready for review, would you like to take a look at it?

Thomasdezeeuw commented 9 months ago

I'm not sure this is worth it to be honest. The macros are already complex and this makes it even worse.

I'm also not the biggest fan of the introduced traits. I understand the thinking behind it, but again it adds more complexity.

Looking at the binary size reductions, they're not that great, less than 0.1% in most cases if I'm reading the tables correctly. Furthermore a lot of the size reduction seem to come from adding #[inline(never)]. What are the binary size differences when only adding that?

KodrAus commented 3 months ago

Just coming in through some triage. It looks like we're not likely to follow this one through and the conflicts may be tricky to resolve given how things have changed in the meantime.

Thanks again for all this investigatory work @EFanZh! Please feel free to open fresh PRs with any other optimizations you find.

rust-lang / log

Specialize target and level arguments that can be statically determined #603