junxzm1990 / x86-sok

Fake functions are wrongly handled #8

Open thaddywu opened 3 years ago

thaddywu commented 3 years ago

Hi, I manually checked the ground truth in your test suite, but the ground truth for some "cold functions" confused me.

Binaries may contain "cold functions". These are fake functions created to improve code locality, as described in the GCC documentation:

The cold attribute is used to inform the compiler that a function is unlikely executed. The function is optimized for size rather than speed and on many targets it is placed into special subsection of the text section so all cold functions appears close together improving code locality of non-cold parts of program. The paths leading to call of cold functions within code are marked as unlikely by the branch prediction mechanism. It is thus useful to mark functions used to handle unlikely conditions, such as perror, as cold to improve optimization of hot functions that do call marked functions in rare occasions. When profile feedback is available, via -fprofile-use, hot functions are automatically detected and this attribute is ignored.

For your convenience, here are some examples from your test suite. In gcc_base.amd64-m64-gcc81-O2 (SPEC2006 gcc -O2), there are many cold functions annotated with the suffix .cold.n, such as cpp_register_pragma.cold.5 (at 0x00401995), cleanup_cfg.cold.6 (at 0x004019db), and insn_cuid.cold.10 (at 0x004019e5).

These cold functions can only be reached by jump instructions, so they should be treated as basic blocks of the functions that jump to them. In other words, their start addresses should not be regarded as function entries. In fact, the instructions in 0x004019db-0x004019e5 constitute a basic block of cleanup_cfg. However, SOK regards cleanup_cfg.cold.6 as a real function. As a result, SOK reports the wrong function entry 0x004019db here and misses basic blocks of cleanup_cfg. SOK appears to handle many "cold functions" correctly, which is impressive, but it still fails on some others.
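To make the expected behaviour concrete, here is a minimal Python sketch of the filtering I have in mind. It is not SOK's code: the function name `split_cold_symbols`, the `(name, address)` input format, and the address used for `cleanup_cfg` itself are assumptions for illustration. It relies only on the `.cold.n` naming seen in the symbols above, so it would not help on stripped binaries.

```python
import re

# Matches GCC split-function names such as "cleanup_cfg.cold.6". The numeric
# suffix is optional so a bare "name.cold" also matches (an assumption about
# how other GCC versions may name these fragments).
COLD_RE = re.compile(r"^(?P<parent>.+?)\.cold(?:\.\d+)?$")


def split_cold_symbols(symbols):
    """Separate candidate function entries from cold fragments.

    symbols: iterable of (name, address) pairs, e.g. parsed from a symbol table.
    Returns (entries, cold_parts):
      entries    maps function name -> entry address
      cold_parts maps parent function name -> list of cold fragment addresses
    """
    entries = {}
    cold_parts = {}
    for name, addr in symbols:
        match = COLD_RE.match(name)
        if match:
            # A cold fragment belongs to its parent; it is not a function entry.
            cold_parts.setdefault(match.group("parent"), []).append(addr)
        else:
            entries[name] = addr
    return entries, cold_parts


symbols = [
    ("cleanup_cfg", 0x00410000),  # hypothetical address, not taken from the binary
    ("cpp_register_pragma.cold.5", 0x00401995),
    ("cleanup_cfg.cold.6", 0x004019DB),
    ("insn_cuid.cold.10", 0x004019E5),
]
entries, cold_parts = split_cold_symbols(symbols)
```

With this split, 0x004019db would be attributed to cleanup_cfg as one of its blocks instead of being reported as a separate function entry.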

bin2415 commented 3 years ago

Hello, I agree that "cold functions" are not real functions. As far as I know, most of these addresses are not marked as function entries by our toolchain. In fact, most of these cold parts are only arranged into non-contiguous regions at the linking stage, whereas we mark function entries at the compiling stage, so we do not treat the start of a cold part as a function entry. The problem you mentioned is interesting, thanks for reporting!