-
## Issue Overview
Hey @jonwomack, thanks for open sourcing your work! I'm eager to see how this could help with my export process. However, I'm running into some issues with it. I've followed your…
-
The following TTGIR currently fails with CUDA_ERROR_ILLEGAL_ADDRESS.
- The configuration used for this test case is `{"block_m":16,"block_n":16,"block_k":16,"split_k":1,"num_stages":2,"num_warps":2…
-
### System Info
```Shell
I'm encountering a "Shared memory manager connection has timed out" error while training my model. The error occurs during the data loading process, specifically when trying …
```
-
Right now we allocate large contiguous areas for wasm from a separate reserve.
With memfd we can stitch together some 128kB allocations from our main arena and give them to wasm.
Demonstration code:…
-
Thanks to @cridenour the plugin now supports in-memory shared databases (see [here](https://www.sqlite.org/inmemorydb.html)).
There is, however, some missing functionality that could be worth impl…
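For readers unfamiliar with the feature, here is a minimal sketch of what an in-memory shared database looks like, using Python's standard `sqlite3` module rather than the plugin itself. The database name `demo_shared_db` and the `items` table are hypothetical, chosen only for illustration. It assumes the linked SQLite build has shared-cache support enabled, which is the usual default.

```python
import sqlite3

# Hypothetical database name; "mode=memory&cache=shared" is the URI form
# described on the SQLite in-memory database page linked above.
URI = "file:demo_shared_db?mode=memory&cache=shared"

# Two independent connections in the same process that open the same URI
# see the same in-memory database.
writer = sqlite3.connect(URI, uri=True)
reader = sqlite3.connect(URI, uri=True)

writer.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
writer.execute("INSERT INTO items (name) VALUES (?)", ("example",))
writer.commit()

# The second connection can read what the first one committed.
rows = reader.execute("SELECT name FROM items").fetchall()
print(rows)

# By contrast, a plain ":memory:" connection gets its own private database.
private = sqlite3.connect(":memory:")
private_tables = private.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(private_tables)
```

The shared database lives only as long as at least one connection to it stays open, so a long-running process would typically keep one connection alive for the database's whole lifetime.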
-
### 🐛 Describe the bug
When I use flex attention on a single RTX 4090, I get an error.
A minimal repro:
```python
import torch
from torch.nn.attention.flex_attention import flex_attention
flex_at…
```
-
Running on 4×H100 as specified in the README:
```
Traceback (most recent call last): …
```
-
### Suggestion Description
When I generate images with ComfyUI and the workflow is more complex, needs to load more models, or produces larger images, I get a **HIP out of memory** error, even if there i…
-
### Component(s)
receiver/hostmetrics
### Describe the issue you're reporting
If https://github.com/open-telemetry/semantic-conventions/pull/933 is merged, we will need to change the receiver implem…