LLVM SanitizerCoverage support?

LebedevRI commented 1 year ago

SanitizerCoverage is a middle-end LLVM instrumentation pass that "inserts calls to user-defined functions on function-, basic-block-, and edge- levels. Default implementations of those callbacks are provided <...>"
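For readers unfamiliar with the "user-defined functions" mentioned above, here is a minimal sketch of the two callbacks that Clang's `-fsanitize-coverage=trace-pc-guard` mode expects. The hook names and signatures come from the Clang SanitizerCoverage documentation; the two counters (`g_next_guard`, `g_edges_hit`) are hypothetical names used only for illustration, and nothing here is specific to Julia.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping, not part of the SanitizerCoverage API.
static uint32_t g_next_guard = 0;  // last guard id handed out
static uint64_t g_edges_hit = 0;   // number of distinct edges seen so far

// Called once per instrumented module with the bounds of its guard array.
extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                    uint32_t *stop) {
  assert(start <= stop);
  if (start == stop || *start) return;  // empty or already initialized
  for (uint32_t *g = start; g < stop; ++g)
    *g = ++g_next_guard;  // non-zero id means "edge is enabled"
}

// Called on every instrumented edge with a pointer to that edge's guard.
extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;  // edge already reported
  ++g_edges_hit;
  *guard = 0;  // report each edge only once
}
```

A fuzzer runtime normally provides these definitions; the sketch only shows the shape of the contract between the instrumentation pass and whoever supplies the hooks.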

Would it be considered acceptable to add (at least) a front-end switch (julia --sanitizer-coverage?) to control insertion of said pass into the LLVM pass pipeline (close to where sanitizer passes are handled)?

The long story is, i have a C++ codebase, and in my experience, having more than one implementation is paramount to weeding out various issues, so i'm somewhat interested in having a second implementation of said codebase. But just having a second implementation isn't sufficient; the key is to be able to compare their externally-observable side-effects, and fuzzing is rather invaluable there. That strongly suggests AOT compilation and guided fuzzing, and thus coverage is needed (thus, subj).

vchuravy commented 1 year ago

Yes, I think that would be a welcome addition. I have long wanted to have support for tsan added in a similar way.

We will have to talk about how this interacts with cached object files / multi-versioning.

cc: @pchintalapudi

LebedevRI commented 1 year ago

Great to hear!

I'm going to take a look then...

LebedevRI commented 1 year ago

Two observations so far:

1. The assertion in RTDyldMemoryManagerJL::allocateCodeSection
   ```
   // allocating more than one code section can confuse libunwind.
   assert(!code_allocated);
   ```
   (https://github.com/JuliaLang/julia/commit/8533a1c40a62e2ae874ae4e3eb6719af24aa34e8#diff-8186bd96ba9aaa52f867b0e1d5e203800b1d15c4f2c2d8b332bb120744b7da85R761-R762) fails with sancov too. i have worked around that by disabling it when sancov is enabled, but it raises the question: is the comment still true/relevant?
2. I haven't solved this yet, but it looks like stub hooks will need to be provided (effectively, https://godbolt.org/z/78ExvTzsY), otherwise even init_f16_funcs() crashes. The interesting question is their linkage, since we'd want the real hooks to override the stubs.
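To make the linkage question concrete, here is a rough sketch of what overridable stub hooks could look like in a statically linked setting. It assumes GCC/Clang on an ELF platform, where `__attribute__((weak))` lets a strong definition from another object file win at link time. The hook names and signatures follow the Clang SanitizerCoverage documentation; the `SANCOV_STUB` macro is a hypothetical helper, and whether weak linkage is the right mechanism inside Julia's JIT is exactly the open question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: C linkage plus weak binding, so that a real
// (strong) hook defined elsewhere replaces the stub at link time.
#define SANCOV_STUB extern "C" __attribute__((weak))

// No-op stub for the inline 8-bit counter registration hook.
SANCOV_STUB void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
  assert(start <= end);  // sanity check only; the stub records nothing
}

// No-op stub for the PC table registration hook.
SANCOV_STUB void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
                                          const uintptr_t *pcs_end) {
  (void)pcs_beg;
  (void)pcs_end;
}

// No-op stub for the 1-byte "compare against constant" tracing hook.
SANCOV_STUB void __sanitizer_cov_trace_const_cmp1(uint8_t arg1, uint8_t arg2) {
  (void)arg1;
  (void)arg2;
}
```

With stubs like these, instrumented code can run without a fuzzer attached, while linking against a fuzzing runtime that defines the same symbols strongly would take precedence.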

pchintalapudi commented 1 year ago

1 should be solved by turning on JITLink (in src/jitlayers.h), 2 may need 1 or 2 new JITDylib linked in the correct order to allow overriding.
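The "linked in the correct order" idea can be illustrated with a toy model. This is plain C++ and deliberately does not use the real ORC API: each `SymbolTable` stands in for a JITDylib, and `resolve` (a hypothetical helper) searches the tables in link order and returns the first definition it finds. Under that assumption, placing the table with the real hooks ahead of the table with the stubs is what makes the override work.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a JITDylib: symbol name -> who provides it.
using SymbolTable = std::map<std::string, std::string>;

// Search the tables in link order; the first definition found wins.
// Returns an empty string if no table defines the symbol.
std::string resolve(const std::vector<const SymbolTable *> &link_order,
                    const std::string &symbol) {
  for (const SymbolTable *table : link_order) {
    assert(table != nullptr);
    auto it = table->find(symbol);
    if (it != table->end()) return it->second;
  }
  return "";
}
```

For example, with a "real hooks" table listed before a "stubs" table, a hook defined in both resolves to the real one, and a hook defined only in the stubs table still resolves to the stub.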

LebedevRI commented 1 year ago

> 1 should be solved by turning on JITLink (in src/jitlayers.h)

Looking at that file, i see the point. But the problem is that it's not known at the time julia itself is compiled whether or not sancov is enabled; it's a run-time setting, so that would effectively require julia to migrate to JITLink completely.

> 2 may need 1 or 2 new JITDylib linked in the correct order to allow overriding.

Yeah, that's the rough plan i guess.

pchintalapudi commented 1 year ago

I think if you want a sancov-capable build there should be a flag that turns on JITLink, but then also does the runtime check for the sancov flag. Turning on JITLink on your chosen platform shouldn't harm the JIT (and if it does, we'd like to hear about it) unless the platform isn't supported by JITLink.

LebedevRI commented 1 year ago

> I think if you want a sancov-capable build there should be a flag that turns on JITLink

... which would be default-on.

LebedevRI commented 1 year ago

Ok, with JL_USE_JITLINK, stubs seem to work, and (non-fuzzing) julia --sanitizer-coverage is usable, but there's now this weird `JIT session error: Duplicate section` issue; i'm guessing pass-generated instrumentation functions happen to end up in a different section?

output

```
$ /builddirs/julia-dev/julia -Cnative -Jusr/lib/julia/sys-debug.so --depwarn=error --check-bounds=yes -g1 --startup-file=no --startup-file=no --color=no --sanitizer-coverage
JIT session error: Duplicate section
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0-DEV.1432 (2023-06-04)
 _/ |\__'_|_|_|\__'_|  |  llvm-sanitizer-coverage/d3546b4d78 (fork: 1 commits, 1 day)
|__/                   |

julia> exit()JIT session error: Duplicate section
JIT session error: Duplicate section

julia> Bool(Base.JLOptions().sanitizer_coverage)JIT session error: Duplicate section

julia> Bool(Base.JLOptions().sanitizer_coverage)
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
true

julia> @code_llvm Bool(Base.JLOptions().sanitizer_coverage)
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
JIT session error: Duplicate section
;  @ float.jl:171 within `Bool`
; Function Attrs: sspstrong
define i8 @julia_Bool_136(i8 signext %0) #0 comdat {
top:
  %gcframe5 = alloca [5 x {}*], align 16
  %1 = load i8, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 0), align 1
  %2 = add i8 %1, 1
  store i8 %2, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 0), align 1
  %gcframe5.sub = getelementptr inbounds [5 x {}*], [5 x {}*]* %gcframe5, i64 0, i64 0
  %3 = bitcast [5 x {}*]* %gcframe5 to i8*
  call void @llvm.memset.p0i8.i32(i8* noundef nonnull align 16 dereferenceable(40) %3, i8 0, i32 40, i1 false)
  %thread_ptr = call i8* asm "movq %fs:0, $0", "=r"() #11
  %ppgcstack_i8 = getelementptr i8, i8* %thread_ptr, i64 -8
  %ppgcstack = bitcast i8* %ppgcstack_i8 to {}****
  %pgcstack = load {}***, {}**** %ppgcstack, align 8
  %4 = bitcast [5 x {}*]* %gcframe5 to i64*
  store i64 12, i64* %4, align 16
  %5 = load {}**, {}*** %pgcstack, align 8
  %6 = getelementptr inbounds [5 x {}*], [5 x {}*]* %gcframe5, i64 0, i64 1
  %7 = bitcast {}** %6 to {}***
  store {}** %5, {}*** %7, align 8
  %8 = bitcast {}*** %pgcstack to {}***
  store {}** %gcframe5.sub, {}*** %8, align 8
  call void @__sanitizer_cov_trace_const_cmp1(i8 2, i8 %0)
  %switch = icmp ult i8 %0, 2
  br i1 %switch, label %common.ret, label %L9

common.ret:                                       ; preds = %top
;  @ float.jl within `Bool`
  %9 = load i8, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 1), align 1
  %10 = add i8 %9, 1
  store i8 %10, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 1), align 1
  %11 = load {}*, {}** %6, align 8
  %12 = bitcast {}*** %pgcstack to {}**
  store {}* %11, {}** %12, align 8
;  @ float.jl:171 within `Bool`
  ret i8 %0

L9:                                               ; preds = %top
;  @ float.jl within `Bool`
  %13 = load i8, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 2), align 1
  %14 = add i8 %13, 1
  store i8 %14, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__sancov_gen_, i64 0, i64 2), align 1
  %15 = getelementptr inbounds [5 x {}*], [5 x {}*]* %gcframe5, i64 0, i64 2
  %16 = bitcast {}** %15 to [3 x {}*]*
;  @ float.jl:171 within `Bool`
  %17 = zext i8 %0 to i64
  %18 = getelementptr inbounds [256 x {}*], [256 x {}*]* @jl_boxed_int8_cache, i64 0, i64 %17
  %19 = load {}*, {}** %18, align 8
  call void @j_InexactError_138([3 x {}*]* noalias nocapture noundef nonnull sret([3 x {}*]) %16, {}* inttoptr (i64 140455184970920 to {}*), {}* readonly inttoptr (i64 140455032792800 to {}*), {}* readonly %19)
  %ptls_field6 = getelementptr inbounds {}**, {}*** %pgcstack, i64 2
  %20 = bitcast {}*** %ptls_field6 to i8**
  %ptls_load78 = load i8*, i8** %20, align 8
  %21 = call noalias nonnull dereferenceable(32) {}* @ijl_gc_pool_alloc(i8* %ptls_load78, i32 1184, i32 32) #9
  %22 = bitcast {}* %21 to i64*
  %23 = getelementptr inbounds i64, i64* %22, i64 -1
  store atomic i64 140455014062912, i64* %23 unordered, align 8
  %24 = bitcast {}* %21 to i8*
  %25 = bitcast {}** %15 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(24) %24, i8* noundef nonnull align 16 dereferenceable(24) %25, i64 24, i1 false)
  call void @ijl_throw({}* %21)
  unreachable
}

julia>
```
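For orientation, the instrumentation visible in that IR can be paraphrased in C++. This is a hand-written model, not generated code: `sancov_counters` plays the role of the three-entry `@__sancov_gen_` array (one 8-bit counter per basic block), `trace_const_cmp1` is a local stand-in for the `__sanitizer_cov_trace_const_cmp1` hook, and `std::domain_error` stands in for Julia's `InexactError`. All of these names are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Models @__sancov_gen_: one inline 8-bit counter per basic block.
static uint8_t sancov_counters[3] = {0, 0, 0};
static unsigned cmp_hook_calls = 0;

// Local stand-in for the comparison tracing hook; it only counts calls.
static void trace_const_cmp1(uint8_t constant, uint8_t value) {
  (void)constant;
  (void)value;
  ++cmp_hook_calls;
}

// Models the instrumented Bool(x): accept 0 or 1, reject anything else.
uint8_t instrumented_bool(uint8_t x) {
  ++sancov_counters[0];      // block "top"
  trace_const_cmp1(2, x);    // comparison operands reported before the branch
  if (x < 2) {
    ++sancov_counters[1];    // block "common.ret"
    return x;
  }
  ++sancov_counters[2];      // block "L9"
  throw std::domain_error("InexactError");
}
```

Calling `instrumented_bool(1)` bumps the first two counters, while an out-of-range argument bumps the first and third before throwing, which is the per-block signal a coverage-guided fuzzer consumes.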

pchintalapudi commented 1 year ago

@code_llvm raw=true dump_module=true might provide additional helpful info here.

vchuravy commented 1 year ago

And you can use the environment variable JULIA_LLVM_ARGS to pass flags like you would to opt. IIRC JITLink has some options to get more information.

LebedevRI commented 1 year ago

(thank you, i'm aware of those tricks, that was more of a rhetorical question)

LebedevRI commented 1 year ago

FTR, here's the complete output of `JULIA_LLVM_ARGS="-print-before-all -debug" /builddirs/julia-dev/julia -Cnative -Jusr/lib/julia/sys-debug.so --depwarn=error --check-bounds=yes -g1 --startup-file=no --startup-file=no --color=no --sanitizer-coverage -E"Bool(Base.JLOptions().sanitizer_coverage)" &> log.txt`: log.txt

It's not obvious which sections are the problem, but i guess it's these:

    12: Creating section for "__sancov_guards"
    13: Creating section for "__sancov_pcs"
    14: ".rela__sancov_pcs" is not an SHF_ALLOC section: No graph section will be created.
    15: Creating section for "__sancov_guards"
    16: Creating section for "__sancov_pcs"

LebedevRI commented 1 year ago

I wonder if we could just uniquify section names for all LLVM IR globals? Or is that generally undesired in Julia?

vchuravy commented 1 year ago

What do you mean by that?

Maybe @lhames has an idea.

LebedevRI commented 1 year ago

> What do you mean by that?

Effectively, the same thing that -ffunction-sections does: https://godbolt.org/z/PrnYr38a9. Just go through each global, and if it is in a section that we have already seen, change its section to a new one.
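The renaming step described above can be sketched on a simplified model. This is plain C++ rather than the LLVM API: `Global` is a hypothetical struct holding just a name and a section string, and the `.N` suffix scheme is an assumption chosen for illustration. A real implementation would presumably walk the module's globals and set their sections instead.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of an IR global: just name and section.
struct Global {
  std::string name;
  std::string section;  // empty means "no explicit section"
};

// True if no two globals share an explicit section name.
static bool all_sections_unique(const std::vector<Global> &globals) {
  std::set<std::string> seen;
  for (const Global &g : globals)
    if (!g.section.empty() && !seen.insert(g.section).second) return false;
  return true;
}

// Walk the globals in order; the first user of a section keeps its name,
// every later user is moved to a fresh "<section>.N" name.
void uniquify_sections(std::vector<Global> &globals) {
  std::set<std::string> seen;
  for (Global &g : globals) {
    if (g.section.empty()) continue;
    if (seen.insert(g.section).second) continue;  // first time seen
    unsigned suffix = 1;
    std::string candidate;
    do {
      candidate = g.section + "." + std::to_string(suffix++);
    } while (!seen.insert(candidate).second);  // skip names already taken
    g.section = candidate;
  }
  assert(all_sections_unique(globals));  // postcondition
}
```

One caveat worth keeping in mind with this approach: anything that locates data by section name (for instance start/stop symbols for a named section) would no longer see the renamed sections under the original name.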

JuliaLang / julia

LLVM SanitizerCoverage support? #50044