WebAssembly / wasi-threads


Query the maximum available parallelism #31

Open abrown opened 1 year ago

abrown commented 1 year ago

For some algorithms, in order to partition work across several threads, we need to know how many threads are available. WebAssembly hosts may decide to limit the amount of parallelism available to a wasi-threads module; in other words, the host could stop spawning threads at some threshold as a way to limit a module's resource consumption. In this scenario, I can think of two options for discovering how many threads are possible:

1. spawn threads until spawning fails and count the successes ("spawn until error"), or
2. expose a function that returns the limit directly.

This second option seems to me to be more flexible (doesn't require spawn-ing and the associated counting machinery). Now, it could be nice to differentiate between a) the "maximum number of threads spawn-able" and b) the "number of threads that will run concurrently." One could imagine a host allowing many spawned threads but only allocating a limited number of cores on which to run them. To accommodate both a) and b) we could expose two functions instead of one; I'm not sure how useful a) is, though, since b) is likely what one wants to know to properly partition work.
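To make the cost of the first option concrete, here is a minimal sketch of "spawn until error" with its counting machinery. It uses Rust's standard library rather than the wasi-threads spawn call, and `probe_spawn_limit` and its `cap` parameter are hypothetical names chosen for illustration. Every probe thread has to stay alive until the probe ends, otherwise the count would not reflect threads that exist at the same time.

```rust
use std::sync::mpsc;
use std::thread;

/// Count how many threads can exist at once, trying at most `cap` spawns.
/// (Hypothetical helper; `cap` keeps the probe from exhausting the host.)
fn probe_spawn_limit(cap: usize) -> usize {
    let mut handles = Vec::new();
    let mut releases = Vec::new();

    for _ in 0..cap {
        let (tx, rx) = mpsc::channel::<()>();
        // Builder::spawn reports a failed spawn as Err instead of panicking.
        let spawned = thread::Builder::new().spawn(move || {
            // Park until the sender is dropped; recv then returns Err.
            let _ = rx.recv();
        });
        match spawned {
            Ok(handle) => {
                handles.push(handle);
                releases.push(tx);
            }
            // The host refused another thread: stop counting here.
            Err(_) => break,
        }
    }

    let count = handles.len();
    // Dropping every sender wakes all probe threads so they can exit.
    drop(releases);
    for handle in handles {
        let _ = handle.join();
    }
    count
}
```

Even in this small form the probe needs a channel per thread, a cap, and a teardown phase, which is the overhead a direct query function would avoid.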

A function implementing b) could look like:

/// Return the maximum number of threads that will run concurrently on this host. This threshold
/// is provided for optimal partitioning of parallel work and could be artificially limited by the
/// host.
max-concurrency: func() -> u32
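As a rough sketch of how a guest might use such a value to partition work: the `max_concurrency` function below is only a stand-in for the proposed import (it calls Rust's `std::thread::available_parallelism`, not any wasi-threads API), and `partition` and `parallel_sum` are hypothetical helpers.

```rust
use std::ops::Range;
use std::thread;

/// Stand-in for the proposed `max-concurrency` import (an assumption, not a real binding).
fn max_concurrency() -> u32 {
    thread::available_parallelism()
        .map(|n| n.get() as u32)
        .unwrap_or(1)
}

/// Split `0..len` into at most `workers` contiguous ranges of near-equal size.
fn partition(len: usize, workers: usize) -> Vec<Range<usize>> {
    // Never use more workers than items, and always use at least one.
    let workers = workers.max(1).min(len.max(1));
    let base = len / workers;
    let extra = len % workers;
    let mut start = 0;
    (0..workers)
        .map(|i| {
            // The first `extra` ranges take one additional item each.
            let size = base + usize::from(i < extra);
            let range = start..start + size;
            start += size;
            range
        })
        .collect()
}

/// Sum a slice using one scoped thread per partition.
fn parallel_sum(data: &[u64]) -> u64 {
    let ranges = partition(data.len(), max_concurrency() as usize);
    thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = ranges
            .into_iter()
            .map(|r| s.spawn(move || data[r].iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}
```

The value is treated purely as a sizing hint here; nothing in the sketch assumes that spawning more threads than the hint would fail.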

Any thoughts on adding such a function? Or should we query limits via "spawn until error"?

sbc100 commented 1 year ago

Some kind of API like JS's navigator.hardwareConcurrency would make sense to me, but I don't think that should be the upper limit of the number of threads that can be created. Shouldn't it be possible to run multi-threaded apps even on a system with a single hardware core?

abrown commented 1 year ago

Sure. And, yeah, navigator.hardwareConcurrency is what I'm getting at with max-parallelism; "concurrency" is the right term to use here, though, so I'm going to edit the issue description to read max-concurrency.

It sounds like you are leaning toward b), exposing a function that will return the "number of threads that will run concurrently", but do you also see a need for a), a separate function to return the "maximum number of threads spawn-able"?

sbc100 commented 1 year ago

I don't think "maximum number of threads spawn-able" is very useful, or practical to implement. It doesn't map to any underlying OS primitive that I know of. I guess it would fall into a general category of resource limits along with things like "max open file descriptors", which WASI also doesn't currently support.

Actually I was wrong, on Linux you can do sysctl(KERN_MAX_THREADS) to get this information. But I'm not sure it is useful to expose this via a WASI API. If we did, it would probably make sense to design a sysctl-like API for discovering system limits. Actually, since it's a constant, I guess it would make sense for it to be imported directly as a wasm global rather than a function. I'm not sure there is precedent for that yet in WASI, but it sounds useful.
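For reference, Linux also exposes this system-wide limit as a procfs file, /proc/sys/kernel/threads-max. A minimal sketch of reading it from a native host program follows; `parse_limit` and `linux_threads_max` are hypothetical helper names, and nothing here is a WASI API.

```rust
use std::fs;

/// Parse the contents of a sysctl-style file that holds one decimal integer.
fn parse_limit(text: &str) -> Option<u64> {
    text.trim().parse().ok()
}

/// Read the system-wide thread limit on Linux.
/// Returns None on other systems or if the file cannot be read or parsed.
fn linux_threads_max() -> Option<u64> {
    let text = fs::read_to_string("/proc/sys/kernel/threads-max").ok()?;
    parse_limit(&text)
}
```

This is a system-wide ceiling rather than a per-process or per-module one, so a host that wanted to report a limit to a module would likely have to combine it with its own policy.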

yamt commented 1 year ago

some apps are using emscripten_num_logical_cores, which iirc maps to navigator.hardwareConcurrency, as a hint to decide the default number of threads. cf. https://github.com/ffmpegwasm/x264/commit/266eb136adc29f80669d35d0564cd7ae7a0bd29d i guess it makes sense to provide an equivalent.

otoh, i agree the max number of threads is not that useful.
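A minimal sketch of the "hint for the default number of threads" pattern, assuming an explicit user setting should win over the hint. `default_thread_count` and its parameters are hypothetical and not taken from the linked commit.

```rust
/// Pick a worker count: an explicit request wins, otherwise fall back to the
/// core-count hint. The result is always clamped to the range 1..=cap.
fn default_thread_count(requested: Option<usize>, hint: usize, cap: usize) -> usize {
    // cap.max(1) keeps the clamp bounds valid even if the caller passes cap = 0.
    requested.unwrap_or(hint).clamp(1, cap.max(1))
}
```

For example, `default_thread_count(None, 8, 4)` yields 4, while `default_thread_count(Some(2), 8, 4)` yields 2. A hint of 0 (say, from a host that cannot report a value) still produces one thread.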