In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).
The stack frames above the first vmem_alloc were also fairly large.
This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.
The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.
Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.
some spl_vmem.c functions are given inline hints
These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.
remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple compile-time warnings
Motivation and Context
Description
How Has This Been Tested?
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Performance enhancement (non-breaking change which improves efficiency)
[ ] Code cleanup (non-breaking change which makes code smaller or more readable)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
[ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
[ ] Documentation (a change to man pages or other documentation)
In https://github.com/openzfsonosx/openzfs/issues/90 a user reported panics on an M1 with the message
"Invalid kernel stack pointer (probable overflow)."
In at least several of these a deep multi-arena allocation was in progress (several vmem_alloc/vmem_xalloc reaching all the way down through vmem_bucket_alloc, xnu_alloc_throttled, and ultimately to osif_malloc).
The stack frames above the first vmem_alloc were also fairly large.
This commit sets a dynamically sysctl-tunable threshold (8k default) for remaining stack size as reported by xnu. If we do not have more bytes than that when vmem_alloc() is called, then the actual allocation will be done in a separate worker thread which will start with a nearly empty stack that is much more likely to hold the various frames all the way through our code boundary with the kernel and beyond.
The xnu / mach thread_call API (osfmk/kern/thread_call.h) is used to avoid circular dependencies with taskq, and the mechanism is per-arena costing a quick stack-depth check per vmem_alloc() but allowing for wildly varying stack depths above the first vmem_alloc() call.
Vmem arenas now have two further kstats: the lowest amount of available stack space seen at a vmem_alloc() into it, and the number of times the allocation work has been done in a thread_call worker.
These are small functions with no or very few automatic variables that were good candidates for clang/llvm's inlining heuristics before we switched to building the kext with -finline-hint-functions.
Motivation and Context
Description
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.