python-greenlet / greenlet

Lightweight in-process concurrent programming
Other
1.65k stars 248 forks source link

Inlining assembly into slp_switch is fragile and compiler implementation-dependent #66

Open basak opened 10 years ago

basak commented 10 years ago

The underlying issues leading to pull requests #64 and #65 suggest to me that slp_switch has potential to be compiled buggy when undefined compiler behaviour changes. Specifically:

It's also particularly awkward to debug a problem in this code, since it heavily depends on what gcc generated and this is more difficult to read.

It seems to me that all of these issues would go away if slp_switch were implemented in a function defined entirely in assembly.

I understand that the SLP_SAVE_STATE and SLP_RESTORE_STATE macros would no longer be possible so changes to these macros would instead have to be done individually for every single platform. But I think this is a better option than having them end up generating bad code, which is what has happened in this case because the current mechanism depends on undefined behaviour. Looking at the git history, it looks like one macro only had to be changed once.

I'll leave it up to you. Feel free to close as "Won't Fix" - in this case I guess we'll just have to act when we see test failures.

(speaking of which, thank you for the tests, which picked up on these issues, and give me confidence that my workarounds work)

snaury commented 10 years ago

Yes, I'm well aware of the issue, and full rewrite in assembly in the only proper solution. Unfortunately it's not that easy, one would need to call helper functions to do actual save/restore. I could do it for x86, then maybe for amd64 (harder, it needs to maintain stack alignment guarantees, which are impossible to know within __asm__ blocks), but for other platforms this would be simply out of the question, not by me at least.

I don't have enough interest to do this, and in my opinion Go, Rust and proper coroutines are the future, not flimsy hacks like greenlet. Even if greenlet did switching perfectly, saving and restoring (or rather overwriting) stacks the way greenlet currently does is unsafe, which breaks gc, breaks threaded C++ code (global pointers to stack allocated objects become unsafe). And I'm sure no one would be thrilled if I moved to proper switching code (e.g. from Boost.Context), used separate stacks (would severely limit 32-bit architectures, e.g. no more than ~2000 greenlets), and dropped lots of architectures in the process. So until someone has enough itch to do the job we are left to working around smarter and smarter compilers.

Btw, pypy has a much better solution to this problem (but for a much smaller set of platforms), single asm block with pointers to save/restore functions. But it's not ideal either due to possibly incomplete clobber lists (callee saved registers are tricky) or potential stack alignment issues (which you can't maintain in __asm__ blocks, since you don't have any guarantees in there). Works in practice though, even has a cross-stacklet stack introspection, which is awesome!

Not closing since the issue is true, and is awaiting its heroes.