eurecom-s3 / symcc

SymCC: efficient compiler-based symbolic execution
http://www.s3.eurecom.fr/tools/symbolic_execution/symcc.html
GNU General Public License v3.0

Use musl to overcome the problem of libc wrapper incomplete #65

Closed tiedaoxiaotubie closed 3 years ago

tiedaoxiaotubie commented 3 years ago

Hi, I noticed that a previous issue, https://github.com/eurecom-s3/symcc/issues/23, mentioned that we could try using musl to replace some libc functions during instrumentation. My question is: suppose we are instrumenting a large-scale program and are not familiar with its build configuration. If I want to use musl to replace a specific libc function in the target program (e.g., use musl's implementation of qsort in place of the qsort in libc), is there any convenient approach?

tiedaoxiaotubie commented 3 years ago

What if I first use symcc to instrument qsort and then use LD_PRELOAD to replace every qsort with our instrumented qsort.so? However, the implementation of qsort is not self-contained, so I am not sure whether this is doable.

yiyunliu commented 3 years ago

I believe that if you use LD_PRELOAD for qsort, the symbolic runtime will also pick up the instrumented version of qsort, which adds unnecessary overhead.

I'm trying to use partial linking and objcopy --localize-symbols to address the problem you were describing. At the moment, I can link one executable against both glibc and musl. Here's my minimal setup:

You can fine-tune which functions you'd like to call from musl by modifying the localization list.

Unfortunately, I can't see an easy way to integrate this into a large build system (there's some extra work required for partial linking). It also segfaults when I replicate the same approach with actual symcc.

tiedaoxiaotubie commented 3 years ago

What about using #define strlen musl_strlen in the target program's source code? Since we only modify the source code of the target program, it won't affect symcc. musl_strlen is an extern; it is a wrapper around the strlen implementation in musl. This way, the original strlen is replaced by musl_strlen at compile time.
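For illustration, here is a minimal single-file sketch of that macro redirect. It is not tested against symcc or a real musl build: the body of musl_strlen below is only a stand-in so the file links on its own, and musl_strlen_calls, count_chars, and check_redirect are hypothetical names added for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter, present only so the redirect can be observed. */
unsigned long musl_strlen_calls = 0;

/* Stand-in for the wrapper around musl's strlen. In a real setup this would
 * be an extern resolved from a separately built musl; it is defined here
 * only so the sketch is self-contained. */
size_t musl_strlen(const char *s)
{
    const char *p = s;
    musl_strlen_calls++;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* The redirect: from this point on, the preprocessor rewrites every use of
 * the identifier strlen in this translation unit to musl_strlen. The #undef
 * guards against a libc header that already defines strlen as a macro. */
#undef strlen
#define strlen musl_strlen

/* "Target program" code, written against the usual libc name. */
size_t count_chars(const char *s)
{
    return strlen(s);
}

/* Small self-check that a caller could run from main(). */
void check_redirect(void)
{
    unsigned long before = musl_strlen_calls;
    size_t n = count_chars("hello");
    assert(n == 5);
    assert(musl_strlen_calls == before + 1);
}
```

Because the rewrite happens in the preprocessor, it only affects translation units that see the macro. Code compiled without it, such as the symcc runtime, keeps calling the libc strlen.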

yiyunliu commented 3 years ago

I think that would work. The macros can be defined in musl's headers, so the target program doesn't have to change. All we need is a version of musl in which every exported function/symbol carries the musl_ prefix. I really wish there were a more generic approach for avoiding conflicts. The instrumented version of libc++ works fine alongside libstdc++ because the C++ library always renames its functions internally.
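A rough sketch of the header-based variant, using qsort as the example. This assumes a prefixed musl build that does not exist in this thread: the "header" section stands in for what a prefixed musl stdlib.h might add, musl_qsort is implemented as a plain insertion sort purely as a placeholder, and musl_qsort_calls, sort_ints, and check_sort_redirect are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- What a hypothetical prefixed musl <stdlib.h> might add ---- */
void musl_qsort(void *base, size_t nmemb, size_t size,
                int (*compar)(const void *, const void *));
#undef qsort
#define qsort musl_qsort   /* object-like, so it also covers &qsort */

/* ---- Unmodified target code, still written against "qsort" ---- */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_int);
}

/* ---- Placeholder for the prefixed musl implementation ---- */
unsigned long musl_qsort_calls = 0;   /* hypothetical counter, demo only */

static void swap_bytes(unsigned char *x, unsigned char *y, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        unsigned char t = x[i];
        x[i] = y[i];
        y[i] = t;
    }
}

/* Insertion sort used only as a stand-in; real musl uses its own algorithm. */
void musl_qsort(void *base, size_t nmemb, size_t size,
                int (*compar)(const void *, const void *))
{
    unsigned char *a = base;
    musl_qsort_calls++;
    for (size_t i = 1; i < nmemb; i++)
        for (size_t j = i;
             j > 0 && compar(a + (j - 1) * size, a + j * size) > 0; j--)
            swap_bytes(a + (j - 1) * size, a + j * size, size);
}

/* Small self-check that a caller could run from main(). */
void check_sort_redirect(void)
{
    int v[] = {4, 2, 9, 1};
    unsigned long before = musl_qsort_calls;
    sort_ints(v, 4);
    assert(v[0] == 1 && v[1] == 2 && v[2] == 4 && v[3] == 9);
    assert(musl_qsort_calls == before + 1);
}
```

With the macros living in the headers, the target sources stay untouched; the remaining work would be producing a musl build whose exported symbols all carry the prefix.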