fubark / cyber

Fast and concurrent scripting.
https://cyberscript.dev
MIT License

"Some languages like ruby and php have not optimized for fibers and are thus left out." #20

Closed ioquatix closed 1 year ago

ioquatix commented 1 year ago

As the author of Ruby's fibers, I take exception to this statement. Did you actually test it?

fubark commented 1 year ago

Yes, it was really slow. I will check again to make sure and post the snippet.

fubark commented 1 year ago

I remember what it was now: Ruby creates fibers with a large initial stack. If you know a way to reduce that, I'd like to know. Here's the benchmark, by the way: https://github.com/fubark/cyber/blob/master/test/bench/fiber/fiber.rb

Edit: I updated the description on the site.

ioquatix commented 1 year ago

If I have time I'll take a look. There are environment variables you can use to reduce the stack size, but I don't believe it will have a big impact: virtual memory is paged in on demand, so a large stack reservation costs little until it is actually touched.

I did compare Python generators with Ruby fibers, and Ruby fibers were slower. What I don't know is whether it's a fair comparison. IMHO, if Python is not allocating a stack and/or context switching somehow, it's not a fiber. Ruby's fibers are closer to green threads than coroutines. That being said, more power to Python if they can take advantage of that to get a performance improvement.

In your own language, do you always allocate a stack? What coroutine implementation are you using?
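
To make the generator-versus-fiber distinction above concrete, here is a minimal Python sketch. It only shows how a generator suspends: at a `yield` in its own frame, with nested suspension requiring explicit delegation through `yield from`. The function names `inner` and `outer` are made up for illustration and are not taken from the benchmark.

```python
def inner():
    # A generator can only suspend at a `yield` inside its own frame.
    yield "inner-1"
    yield "inner-2"


def outer():
    yield "outer-start"
    # To let the nested generator suspend the caller, every level must
    # delegate explicitly. A plain call to inner() would just return a
    # generator object without running it.
    yield from inner()
    yield "outer-end"


values = list(outer())
print(values)
```

A fiber with its own stack can instead suspend from any call depth without the intermediate functions cooperating, which is part of why the two are hard to compare directly.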

fubark commented 1 year ago

Yep, Python has stackless coroutines, so they have an advantage over the others in the benchmark, but it still has the same mechanism of switching the context. In Cyber, it allocates a small stack to start. Since each function call knows how much stack space it needs, it will grow the stack if it's not big enough already.
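
The grow-on-call scheme described above can be sketched roughly as follows. This is a toy Python model, not Cyber's implementation: the class `FiberStack`, its doubling policy, and the slot counts are all assumptions chosen for illustration.

```python
class FiberStack:
    """Toy model of a fiber stack that starts small and grows on demand."""

    def __init__(self, initial_slots=4):
        self.slots = [None] * initial_slots
        self.top = 0          # index of the first free slot
        self.grow_count = 0   # how many times the stack was enlarged

    def ensure(self, needed):
        # Bounds check done before each call: the callee's frame size is
        # known up front, so one comparison decides whether to grow.
        required = self.top + needed
        if required <= len(self.slots):
            return
        new_size = len(self.slots)
        while new_size < required:
            new_size *= 2  # doubling is an assumed policy
        self.slots.extend([None] * (new_size - len(self.slots)))
        self.grow_count += 1

    def push_frame(self, frame_size):
        self.ensure(frame_size)
        base = self.top
        self.top += frame_size
        return base

    def pop_frame(self, frame_size):
        self.top -= frame_size


stack = FiberStack(initial_slots=4)
a = stack.push_frame(3)  # fits in the initial 4 slots
b = stack.push_frame(6)  # needs 9 slots in total, so the stack grows to 16
print(a, b, len(stack.slots), stack.grow_count)
```

The cost of this approach is the comparison in `ensure` on every call, which is the check the guard-page question later in the thread is trying to avoid.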

ioquatix commented 1 year ago

How do you allocate the stack? mmap or malloc? Do you have a guard page?

fubark commented 1 year ago

It currently doesn't do its own allocation; it defers to mimalloc, which uses mmap. It checks the bounds for each func call before executing the function. Would guard pages allow skipping the bounds checks? If so, that would save more cycles, but I don't know how I would handle it if it triggered in the middle of a function call.

ioquatix commented 1 year ago

> It currently doesn't do its own allocation; it defers to mimalloc, which uses mmap. It checks the bounds for each func call before executing the function.

Languages like Go used to have a segmented stack. I think the computation part is incompatible with native functions and/or recursion. Virtual memory can be better, since it is allocated on demand. A guard page is a little expensive to allocate, but you can cache it.

> how I would handle it if it triggered in the middle of a function call.

Crash :)

fubark commented 1 year ago

That's interesting, I didn't think to try a segmented stack. That would save on copy ops. I read somewhere that the main issue is that freeing a segmented stack is slow, but I'm also thinking it wouldn't be good for the cache, since the memory is all over the place.
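
The copy-versus-fragmentation trade-off mentioned above can be illustrated with a toy Python comparison. Both classes are hypothetical models written for this sketch, with an assumed capacity of 4 slots and a doubling policy. Neither reflects how Cyber or Go actually manage stacks.

```python
class ContiguousStack:
    """One buffer; growing means allocating a bigger one and copying."""

    def __init__(self, capacity=4):
        self.buf = [0] * capacity
        self.top = 0
        self.copied = 0  # total slots copied during growth

    def push(self, value):
        if self.top == len(self.buf):
            new_buf = [0] * (len(self.buf) * 2)
            new_buf[:self.top] = self.buf  # copy every live slot
            self.copied += self.top
            self.buf = new_buf
        self.buf[self.top] = value
        self.top += 1


class SegmentedStack:
    """Fixed-size segments chained together; growth never copies."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.segments = [[]]

    def push(self, value):
        if len(self.segments[-1]) == self.segment_size:
            # New segment is a separate allocation, so the stack is no
            # longer contiguous in memory.
            self.segments.append([])
        self.segments[-1].append(value)


contiguous = ContiguousStack()
segmented = SegmentedStack()
for i in range(10):
    contiguous.push(i)
    segmented.push(i)

print(contiguous.copied, len(segmented.segments))
```

In this model the contiguous stack pays for growth with copies, while the segmented stack pays with extra allocations that may be scattered in memory and must each be freed.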

ioquatix commented 1 year ago

There are lots of issues with the segmented stack design; Go would be a good case study of why not to use it. Some languages may be more successful with it if they have different trade-offs.