Closed tbodt closed 7 years ago
I notice you have v8 on your stack. As far as I know greenlet is fundamentally incompatible with v8 (or any C++ code that shares pointers to objects on the stack globally). The reason for this is that when greenlets switch the stack is overwritten back and forth and objects become temporarily invalid until you switch back to that greenlet, thus any access to shared objects becomes a hazard.
In this instance there might be another issue in that greenlet truly does overwrite the stack and it might not look much different from a buffer overflow to a sanitizer. You probably need to patch address sanitizer to teach it how to deal with stack switching in greenlet.
I know that greenlet is incompatible with V8.
This is not just about V8 though, it also happens when V8 is not involved.
The asan runtime includes functions that stack switchers can call to shut up these messages, you could probably use those.
Regarding V8 compatibility, is it theoretically possible to reimplement greenlet on top of libcoro? node-fibers uses that, and it works.
My big concerns are stack sizes, switch performance and platform compatibility. For me a big appeal of greenlet was that it doesn't use dedicated stacks, you only pay for the delta of used stack space which makes it possible to create millions of greenlets and they cost very little. Next, swapcontext is very slow and needs a lot of space for all that state, but dedicated stacks mean you need to worry about their sizes. If the size is too small (Coro on perl uses something like 64-128kb, which is a lot btw) you risk running out and crashing, if the size is too big you'd have to use mmap for efficiency and then you run other risks, like running out of the maximum number of mmaps per process.
Personally I was thinking about rewriting greenlet in cython and using boost::context for quite a while, however such a library wouldn't be called greenlet (maybe greenlet-context or whatever, since the number of supported platforms would be much smaller than what greenlet currently supports and what people surprisingly actually use), and I'm not interested in it that much. If you have spare time you may try going that route, however I wouldn't accept such a solution into the mainline greenlet, as it would have entirely different set of drawbacks than what people currently expect.
The asan runtime includes functions that stack switchers can call to shut up these messages, you could probably use those.
I'm not sure those apply, since greenlet doesn't actually switch stacks, it reuses system stack by overwriting it with saved deltas. Currently it doesn't even know where the bottom or what the size of the stack is, so it doesn't look like those callbacks apply to what greenlet does.
libcoro has a variety of different implementations:
You can choose at compile time. So I don't think platform support is a problem.
About performance:
Stack space is definitely still an issue. FWIW libcoro includes stack allocation code that uses mmap if it's supported and malloc otherwise.
I'd like to make a libcoro-based version of greenlet, but the problem is getting libraries such as gevent to use it. My choices are to either make the fork appear in sys.modules under the name greenlet (which I really don't want to do), or to get something merged into greenlet. If I could pull off something that lets you choose if you want to use libcoro on a per-coroutine basis, could that be merged, or would it be too disgusting?
And you're right, the asan functions don't really apply to greenlet. They would apply to libcoro though.
It's not about disgust, if you make a fork I would probably gladly start using it myself, and if you convince gevent and eventlet to use it, then even better! :) It would work better, would have less opportunities for crashing and would probably be a lot faster too.
But as a maintainer I feel my primary responsibility is not breaking code for people that already use greenlet for whatever scenario. And scenarios for using greenlet are different for different people too, just 4-5 years ago I had something like 100k greenlets sloshing around in a single process and I have serious doubts any mmap based solution would have worked for me back then. Now that Go is more mature I would have preferred to use Go, and wouldn't want to do anything like that in Python ever again, so I personally wouldn't be affected. However I'm sure there's someone out there who's doing something crazy like that right now and mmap based solution would stop them from succeeding.
Making a patch that makes the new mode optional (behind something like greenlet.enable_awesome_mode()
) would probably work, but it would add a lot of complexity to an already complex code, and for what? People won't even know they need to use it, and I have no way of checking all 18+ platforms (and many more variants) for breakage. New library and a clean break is a lot better, at least that's how I would have done it if I was still interested enough. There's way too much complexity in greenlet right now that has to deal with unexpected gc and unexpected switching, with libcoro you'd have to deal with a lot less surprises like that, since stack cannot get invalidated right under you. It would be a lot faster too (no malloc and copy on every switch), but I'm not the one you need to convince.
I think if you make a new library and it's a lot faster (and trust me it will be, 6500ns on benchmarks/chain.py shouldn't be hard to beat), then people would recognize safety and a different set of tradeoffs are worth it. If you make using it as easy as from something import greenlet
, then porting and trying it out should be easy. You shouldn't want to integrate it into the current mess of greenlet.
Sounds like a fun weekend project. 😄
Make greenlets great again!
Sounds like a fun weekend project. Make greenlets great again!
@tbold, just curious, did anything happened on your side on this topic?
Thanks beforehand for feedback, Kirill
I made this: http://github.com/tbodt/greenstack. Try not to use it in production.
Here's the output:
Probably the most important line from that: