Open basak opened 10 years ago
Yes, I'm well aware of the issue, and full rewrite in assembly in the only proper solution. Unfortunately it's not that easy, one would need to call helper functions to do actual save/restore. I could do it for x86, then maybe for amd64 (harder, it needs to maintain stack alignment guarantees, which are impossible to know within __asm__
blocks), but for other platforms this would be simply out of the question, not by me at least.
I don't have enough interest to do this, and in my opinion Go, Rust and proper coroutines are the future, not flimsy hacks like greenlet. Even if greenlet did switching perfectly, saving and restoring (or rather overwriting) stacks the way greenlet currently does is unsafe, which breaks gc, breaks threaded C++ code (global pointers to stack allocated objects become unsafe). And I'm sure no one would be thrilled if I moved to proper switching code (e.g. from Boost.Context), used separate stacks (would severely limit 32-bit architectures, e.g. no more than ~2000 greenlets), and dropped lots of architectures in the process. So until someone has enough itch to do the job we are left to working around smarter and smarter compilers.
Btw, pypy has a much better solution to this problem (but for a much smaller set of platforms), single asm block with pointers to save/restore functions. But it's not ideal either due to possibly incomplete clobber lists (callee saved registers are tricky) or potential stack alignment issues (which you can't maintain in __asm__
blocks, since you don't have any guarantees in there). Works in practice though, even has a cross-stacklet stack introspection, which is awesome!
Not closing since the issue is true, and is awaiting its heroes.
The underlying issues leading to pull requests #64 and #65 suggest to me that
slp_switch
has potential to be compiled buggy when undefined compiler behaviour changes. Specifically:slp_switch
is not re-ordered around calls toslp_save_state
andslp_restore_state
.REGS_TO_SAVE
with registers that are clobbered only applies to that__asm__
statement itself. A compiler might, for example, fulfill the need to preserve these registers in each call just by pushing and popping them before and after. This breaks the assumption that these registers will be saved across theslp_save_state
andslp_restore_state
calls.slp_restore_state
runs in a "different" stack (right?). The compiler does not know this, and can generate code that assumes that registers are preserved within the function but over the call toslp_restore_state
.It's also particularly awkward to debug a problem in this code, since it heavily depends on what gcc generated and this is more difficult to read.
It seems to me that all of these issues would go away if
slp_switch
were implemented in a function defined entirely in assembly.I understand that the
SLP_SAVE_STATE
andSLP_RESTORE_STATE
macros would no longer be possible so changes to these macros would instead have to be done individually for every single platform. But I think this is a better option than having them end up generating bad code, which is what has happened in this case because the current mechanism depends on undefined behaviour. Looking at the git history, it looks like one macro only had to be changed once.I'll leave it up to you. Feel free to close as "Won't Fix" - in this case I guess we'll just have to act when we see test failures.
(speaking of which, thank you for the tests, which picked up on these issues, and give me confidence that my workarounds work)