Test is totally dependent on default stack size - which varies a lot!

johnrs commented 1 year ago

I don't believe that your test, as is, is meaningful. It's like comparing the size or compile time for programs which just say Hello World. That just establishes a baseline. The more important thing is how fast they grow as you add code.

The space required to open a fiber or thread is mostly it's stack. If you start out with a trivially small stack then the program will have to grow the stack it to run a reasonable task. Also, there's run-time overhead doing that.

What's the optimum starting stack size? Obviously there's no consensus. Go uses 2KB. If Rust is 12x smaller, then it must to be using a stack less than 200 bytes. It that enough for the normal case?

Last I looked, the default for a Linux thread was 2MB. Starting just 100K of those would take 200GB. But Linux threads don't have stacks that dynamically size. So they have to plan for worst case. Even so, 2MB is a LOT larger than 200B or 2KB.

I suggest that you put some reasonable code in your fibers, and run them. Perhaps create, then pass a 1KB buffer (value, not pointer) to another function. I think the results will be quite different then, and more meaningful.

johnrs commented 1 year ago

P.S. I believe that some languages also scale down the stack size after a while, if it's more than is being used.

oleid commented 1 year ago

I think there is a misconception about async rust. If I'm not mistaken the individual tasks don't get their own stack. Thus, you need less memory when creating lots of them. The default stack size of rust is 2 MiB per thread.

I suggest that you put some reasonable code in your fibers, and run them. Perhaps create, then pass a 1KB buffer (value, not pointer) to another function.

Wether this works highly depends on compiler optimization. One needs to be careful.

johnrs commented 1 year ago

Rust: That’s interesting. But as soon as they do something meaningful, they will need stack space. I thought that Async used fibers, not threads. Ahhh. I was thinking of Rust's Tokio. That would be the equivalent of Go's goroutines. Callbacks, Futures, and Async are all a very limited form of concurrency.

Code: Yes, but it isn’t hard into fooling the compiler into not optimizing it out of existence. For example, you could test a result and print something if the test fails, for example. Just write the code so that it doesn’t fail. Yes, the test will take a bit of runtime, but it should be far less than the time to set up another fiber or thread.

oleid commented 1 year ago

Rust: That’s interesting. But as soon as they do something meaningful, they will need stack space. I thought that Async used fibers, not threads. Ahhh. I was thinking of Rust's Tokio. That would be the equivalent of Go's goroutines.

The stack is per-thread, not per task. Async-std and Tokio use a thread-pool to run the tasks. Tokio also providers a single-threaded executor, however, that one isn't used here. The async functions called from the task essentially compile down to a state machine.

johnrs commented 1 year ago

That's not general purpose concurrency, like Go and some others have. It's comparing apples to oranges.

darren-clark commented 1 year ago

That's not general purpose concurrency, like Go and some others have. It's comparing apples to oranges.

It's any sort of non thread-per-task concurrency. Which AFAIK is pretty much all of them, including Go, except where you're explicitly creating a thread. As the author noted, at a certain point he was unable to continue running the explicit thread creation cases, probably for exactly this reason.

Note the C# case forces a thread switch unnecessarily, but it's to a thread pool thread.

johnrs commented 1 year ago

In Go you don't need to create threads. The goroutine runtime takes care not only of concurrency, but also parallel (multi-core/cpu), for you. The "main" function is executed in a goroutine. All the rest you set up, as needed. There's no way to do that without each goroutine having it's own stack.

Besides async, co-routines are another form of concurrency which some languages offer. Both async and co-routines are limited to a few patterns. True concurrency doesn't impose the limited environments that they have.

Fibers (also known as green threads, user threads, etc) are definitely different than threads (operating system threads). But to the user they seem to work the same, but fibers have a lot less switching and stack overhead. Note: both have stacks.

You can't have general purpose concurrency without a way of saving state - both for runtime or operating system info, and for your task's info. Imagine you are 5 function calls deep in a task on a fiber, when you get blocked. That call stack has to be restored when you resume. Go tried a few different default stack sizes, then settled on a starting size of 2KB. But to allow for tasks which grow their stack needs, and possibly shed them later, Go can expand and shrink the stack size.

Have you heard about red and blue functions? Languages without general concurrency use these as a kludge - a really irritating one! In Go you just accept blocking. If you want to be doing something else while blocked then do it in another goroutine. That's what they are for!

This is the same as you would do with operating system threads - except that the higher switching overhead makes it inefficient to do this for small things, and the number of threads is severely limited by stacks that are 1000 (for Linux) times larger by default. So operating system threads tend to be used only for big tasks and small tasks are out of luck. Hence fibers!

Benchmarking who can do nothing the fastest isn't meaningful. Opening up 1M web user connections, even if they are idle after being opened, would be interesting. And doing it in a way that remembers user state, saving a lot of "stateless" communications, would be even more interesting.

pkolaczk / async-runtimes-benchmarks

Test is totally dependent on default stack size - which varies a lot! #1