jl_gc_safepoint() is expensive because every call has to get the current stack from pthread keys. The cheaper alternative is jl_gc_safepoint_(ptls), which reuses a ptls the caller already holds. This can improve mutator time significantly for benchmarks that allocate frequently from runtime code (e.g. a 20% improvement for objarray).
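To illustrate the cost difference, here is a minimal, self-contained sketch of the pattern. It is not Julia's actual implementation: every demo_* name is hypothetical, the "safepoint" only bumps a counter, and the demo assumes a single thread. It only shows that the no-argument variant pays for a pthread_getspecific() lookup on each call, while the variant taking a pointer lets the caller look the state up once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for per-thread runtime state (not Julia's real layout). */
typedef struct {
    volatile size_t safepoint_polls; /* counts polls so the effect is observable */
} demo_tls_t;

static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static demo_tls_t demo_tls_storage; /* single-threaded demo: one state object */

static void demo_make_key(void)
{
    int rc = pthread_key_create(&demo_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Register the calling thread's state under the pthread key. */
static demo_tls_t *demo_init_thread(void)
{
    pthread_once(&demo_once, demo_make_key);
    demo_tls_storage.safepoint_polls = 0;
    pthread_setspecific(demo_key, &demo_tls_storage);
    return &demo_tls_storage;
}

/* Cheap variant: the caller already holds the pointer. */
static inline void demo_safepoint_(demo_tls_t *ptls)
{
    ptls->safepoint_polls++;
}

/* Expensive variant: performs a key lookup on every call. */
static inline void demo_safepoint(void)
{
    demo_safepoint_((demo_tls_t *)pthread_getspecific(demo_key));
}

/* Looks the state up on each iteration. */
static size_t demo_poll_lookup_each_time(size_t n)
{
    for (size_t i = 0; i < n; i++)
        demo_safepoint();
    return ((demo_tls_t *)pthread_getspecific(demo_key))->safepoint_polls;
}

/* Looks the state up once and reuses the pointer in the loop. */
static size_t demo_poll_hoisted(size_t n)
{
    demo_tls_t *ptls = (demo_tls_t *)pthread_getspecific(demo_key);
    for (size_t i = 0; i < n; i++)
        demo_safepoint_(ptls);
    return ptls->safepoint_polls;
}
```

Both loops do the same work per poll. The second one simply moves the key lookup out of the hot path, which is the same trade the change makes in runtime code that already has ptls in hand.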