-
### Proposed new feature or change:
This follows up on the discussion on the mailing list.
From my POV, sum for medium-sized arrays (50k+) often becomes a bottleneck in greedy algorithms that …
-
We're leaving a lot of resources idle right now because of packing inefficiency. In the GPU case, where it matters most, we have too many high-memory jobs.
Example:
```
pilot announces: 10 cpu, …
-
While working on issue #11844, I realized that there's something a little funny about our `hasSingleLocalSubdomain()` routine's definition. Specifically, it's currently `param` which is based on thin…
-
I would like to enable MPI unit testing and doctesting, with IDE integration and command-line tooling. I found the [rusty-fork](https://docs.rs/rusty-fork/latest/rusty_fork/) library, which I think is…
-
**Describe the issue**:
Dask `Client` with `threads_per_worker=1` does not limit every kind of thread, so the limit can be exceeded, for example by sklearn FastICA through OpenMP.
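
For context, a minimal sketch of one common way to cap native thread pools, assuming the libraries involved honor the standard `OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS`, and `MKL_NUM_THREADS` environment variables. The helper `limit_native_threads` is hypothetical and not part of Dask; it only takes effect if it runs before the numerical libraries are loaded in the process that does the work.

```python
import os

# Environment variables read by OpenMP, OpenBLAS and MKL at load time.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_native_threads(n=1, environ=None):
    """Hypothetical helper (not a Dask API): set the native thread caps to n.

    Must run before NumPy, SciPy or scikit-learn are imported in the
    process, because the libraries read these variables when they load.
    """
    env = os.environ if environ is None else environ
    for name in THREAD_ENV_VARS:
        env[name] = str(n)
    # Return what was set so callers can log or verify it.
    return {name: env[name] for name in THREAD_ENV_VARS}


limits = limit_native_threads(1)
print(limits)
```

Whether worker processes pick these values up depends on how they are started, so this would need checking against the actual deployment. The `threadpoolctl` package offers a runtime alternative via `threadpool_limits`.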
**Minimal Complete Verifia…
-
### Nomad version
Client: 1.4.3
Server: 1.4.4
### Operating system and Environment details
```
root@ns5020866:/home/ubuntu# nomad operator scheduler get-config
Scheduler Algorithm = …
-
**Describe the bug**
ort + onednn is much slower than ort + cpu. Is this a known problem? Thanks.
**System information**
onnxruntime-dnnl 1.12.0
-
Hello, I have read your thesis and code and I think your idea is great! However, I have a question. Since the introduction of Stream-Ordered Memory Allocator in CUDA 11.2, cudaMallocAsync and cudaFree…
-
This looks like a fantastically useful library.
Just a point for consideration, given that one of the targets is to make it easier to write fast code:
https://blog.cloudflare.com/on-the-dangers-of…
-
We can use our existing engine; this would run inside it. Whether it runs as an iframe or as some kind of "plugin" to the existing engine is TBD.