Currently, invoking `pgm_read_byte()` is not always inlined: e.g. in `libc.a`, there are 5 function bodies of, and 89 `CALL0`s (sometimes inside loops) to, `pgm_read_byte_inlined()`.
Generally, function calls hurt performance: call/return instruction overhead, unwanted register-register moves to comply with the calling convention, and many callee-clobbered registers that hinder efficient register allocation.
In contrast, complete inlining brings further optimization opportunities, e.g. common-subexpression elimination/sharing and loop-invariant expression hoisting.
Before & after in bytes of `.text`, in `libc.a`:

- `lib_a-nano-svfprintf.o`
- `lib_a-nano-vfprintf.o`
- `lib_a-regcomp.o`
- `lib_a-regexec.o`
- `string_pgmspace.o`