shirok / Gauche

Scheme Scripting Engine
https://practical-scheme.net/gauche
Other
816 stars 81 forks source link

Compiling with address sanitizer #638

Open lassik opened 4 years ago

lassik commented 4 years ago

Gauche's source code has some references to clang's address sanitizer. I think asan is awesome and have found it invaluable in finding bugs in my own C code. Can Gauche currently be compiled using asan, and are there some GC tricks that confuse it?

pclouds commented 4 years ago

ASAN_OPTIONS is mentioned in gc .travis.yaml and there's code in the GC to recognize address_sanitizer feature. I'd say just try it out. Probably will work out of the box.

Oh.. and I think GC is the only thing that can trip asan (Shiro can correct me here because I only touch a portion of Gauche)

pclouds commented 4 years ago

Just tried with CFLAGS=-fsanitize=address and ran make test. We may have a problem in count_size_and_length.

lassik commented 4 years ago

Thanks for helping out!

pclouds commented 4 years ago

@shirok CiSE generated code probalby confuses the address sanitizer (or worse, we're relying on undefined behavior). sys-strftime generates this code

{
    char tmpbuf[256];
    strftime(tmpbuf,sizeof(tmpbuf),format,&(SCM_SYS_TM_TM(tm)));
    {
        SCM_RESULT=(tmpbuf);
        goto SCM_STUB_RETURN;
    }
}
goto SCM_STUB_RETURN;
SCM_STUB_RETURN:
SCM_RETURN(SCM_MAKE_STR_COPYING(SCM_RESULT));

The problem here is tmpbuf is declared in a smaller scope, which ends when we goto to the outside label SCM_STUB_RETURN.

I think asan is technically correct here, that we can't know if tmpbuf remains the same after goto-ing outside of tmpbuf declaration code.

Anyway, by removing the {} around tmpbuf block, the error goes away and make test passes with asan. Which is really good news.

lassik commented 4 years ago

Awesome. They just made Chicken asan-compatible not long ago. It's great to add Gauche to the list. That thing may be the best thing to happen to C programming in a decade.

pclouds commented 4 years ago

Correction, make -C src test passes. Testing external modules will still fail with the same problem (CiSE generates lots of stuff). But since the core code passes asan, I'm quite confident we will not hit blocker issues. Just a bug here and there to iron out.

shirok commented 4 years ago

Thanks for finding this. A quick workaround would be to avoid autoboxing of return value and call SCM_MAKE_STR_COPYING explictly in the CiSE code, but this behavior is inherently dangerous.

I'm looking at the cgen. modules to fix this. Simply removing extra {} won't do, because `letcan be arbitrarily nested. I think I should generate the boxing code in place ofreturn`, rather than making jump and let the epilogue handle it.

shirok commented 4 years ago

It turned out it's not straightforward to change cgen to address this issue with keeping backward compatible to existing extension code. Maybe 1.0 is a good time to make a sweeping change, but I first need to update the code I'm maintaining to catch up the modern way.

Meanwhile, I fixed sys-strftime to avoid the noted issue.

pclouds commented 4 years ago

Thanks. sys-ctermid needs fxing too because to pass the test suite with address sanitizer (it's used in termios tests). And I think the potential problem is also in sys-gethostname and sys-getdomainname.

Asan also detects memory leaks (!) in bcrypt. Something I could look into. Building static binaries (in test/scripts.scm) also fails probably because -faddress=sanitize is missing, should also be easy to fix.

Apart from that the whole test suite passed.

shirok commented 4 years ago

Thanks, I merged the two PRs. I'll go over sys-* functions.