crystal-lang / crystal

Stack size of non-main threads on POSIX #15045

Open HertzDevil opened 4 weeks ago

HertzDevil commented 4 weeks ago

On Unix-like systems, the size of a stack depends on where the corresponding fiber is:

The last one varies greatly across platforms, down to just 128 KiB on Alpine Linux. Normally, this isn't a concern for the Crystal runtime, since under -Dpreview_mt the main fibers of all the extra threads are just the scheduler's worker loops, which are not expected to grow their stacks arbitrarily. However, it is still relevant if one calls Thread.new directly, so we might want to standardize this stack size as well.
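For anyone who wants to check what their platform hands out, here is a minimal C sketch that reads the stack size a thread would get when created with default attributes. The function name `default_thread_stack_size` is made up for illustration; the pthread calls are standard POSIX. This is not how the Crystal runtime queries the value, just a quick way to observe the platform differences mentioned above.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: returns the stack size (in bytes) that a thread
 * created with default attributes would receive, or 0 on failure. */
size_t default_thread_stack_size(void) {
    pthread_attr_t attr;
    size_t size = 0;

    /* A freshly initialized attribute object carries the platform defaults. */
    if (pthread_attr_init(&attr) != 0) {
        return 0;
    }
    if (pthread_attr_getstacksize(&attr, &size) != 0) {
        size = 0;
    }
    pthread_attr_destroy(&attr);
    return size;
}
```

Printing the result on a glibc-based distribution and on Alpine (musl) should make the gap visible; the exact numbers depend on the libc version and configuration.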

Windows doesn't make a distinction between main and extra threads if we pass 0 as the stack size when spawning a new thread with GC.beginthreadex, since they both pick up the value of the /STACK link flag.

straight-shoota commented 1 week ago

It's worth noting that Thread.new is currently not a public API. There should eventually be a public API to spawn a thread explicitly. With https://github.com/crystal-lang/rfcs/pull/2 this is implemented as ExecutionContext::Isolated: a context that starts a single thread to run one exclusive fiber. So user code still runs in a fiber, and the thread stack will be unused except for starting the fiber. We could perhaps consider using the thread stack as the fiber stack to avoid an additional allocation, but that would be an implementation detail (and probably not very clever).

I think that means all spawned threads can be initialized with a rather minimal stack size. On Linux, the minimum value is PTHREAD_STACK_MIN (16 KiB on x86-64 with glibc; the value is platform-specific).

ysbaddaden commented 1 week ago

Indeed, the current ExecutionContext::Isolated doesn't create another fiber, and just uses the thread's stack. It would be pointless to create another fiber.

Maybe we should be able to pass an explicit stack when creating a thread, just for that specific case... or maybe be able to specify very small stacks for scheduler threads, since their run loops don't need more than a few KB at worst.