-
**Target:** Determine whether, by default, Torch's `scaled_dot_product_attention`, when executing on the Hopper architecture (so that `PLATFORM_SUPPORTS_CUDNN_ATTENTION` and `SM90OrLater` evaluate to `True`), executes…
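
One way to narrow this down is to restrict SDPA to a single backend at a time and see which ones accept a given input. The sketch below assumes a PyTorch build that ships `torch.nn.attention.sdpa_kernel`; the helper name `probe_sdpa_backends`, the tensor shape, and the fp16 dtype are illustrative choices, not taken from the report. It only shows which backends can run, not which one the unrestricted call picks first.

```python
# Hypothetical probe: which SDPA backends accept a given input on this machine?
try:
    import torch
    import torch.nn.functional as F
    from torch.nn.attention import SDPBackend, sdpa_kernel
except ImportError:  # no PyTorch, or a build without torch.nn.attention
    torch = None


def probe_sdpa_backends(shape=(2, 8, 128, 64)):
    """Return {backend_name: bool}; empty dict if PyTorch or CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return {}
    # Assumed layout: (batch, heads, seq_len, head_dim) in fp16.
    q, k, v = (torch.randn(shape, device="cuda", dtype=torch.float16) for _ in range(3))
    results = {}
    for name in ("CUDNN_ATTENTION", "FLASH_ATTENTION", "EFFICIENT_ATTENTION", "MATH"):
        backend = getattr(SDPBackend, name, None)  # older builds lack some members
        if backend is None:
            continue
        try:
            with sdpa_kernel(backend):  # only this backend is allowed inside the block
                F.scaled_dot_product_attention(q, k, v)
            results[name] = True
        except RuntimeError:  # raised when no enabled kernel can handle the input
            results[name] = False
    return results


if __name__ == "__main__":
    print(probe_sdpa_backends())
```

To see which kernel the default, unrestricted call actually launches, a profiler trace of a plain `scaled_dot_product_attention` call would still be needed.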
-
Multiplying a `CuSparseMatrixCSC` by a `CuSparseVector` returns a `CuArray` instead of a `CuSparseVector`.
```julia
using CUDA
using CUDA.CUSPARSE
using Random
using SparseArrays
Random.see…
-
I had an issue in one of the services I work on, where it would use more and more memory until crashing. After some digging around I was able to reduce it to the following script:
```python
import a…
-
### Bug description
I have a sharded checkpoint that was saved via `trainer.save_checkpoint("/path/to/cp/dir/", weights_only=False)`, which I am trying to load during testing via `trainer.test(dataloade…
-
### 🐛 Describe the bug
Versions: Python 3.10, torch 2.1.0, CUDA 12.1.

Put a breakpoint on the line `x = x.detach()`:
```python
import torch
import torchvision
class BoQ_DinoV2(torch.nn.Module):
…
-
### System Info
Hello, I am trying to load Mistral-Nemo Instruct-2407 in bnb 4-bit on 4 A10 GPUs on an EC2 instance.
I upgraded all the packages.
Still, I get a CUDA out-of-memory error when train batc…
-
Hello, I tried to use megablocks on a V100 with pytorch2.4.0+cu121, but I get an error with "cannot support bf16". If I use megablocks in fp32, I get the error "group gemm must use bf16". So I changed my enviro…
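
For context on the bf16 error, a quick capability check can confirm whether the GPU has native bf16 support. This is a minimal sketch, assuming native bf16 begins at compute capability 8.0 (Ampere); the helper names `supports_native_bf16` and `current_device_capability` are hypothetical, and only the `torch.cuda` calls are real PyTorch API.

```python
# Hypothetical helper names; only the torch.cuda calls are real PyTorch API.
try:
    import torch
except ImportError:  # lets the capability check below run without PyTorch
    torch = None

# Assumption: native bf16 tensor-core support starts with Ampere (sm_80).
MIN_BF16_CAPABILITY = (8, 0)


def supports_native_bf16(capability):
    """Return True if a (major, minor) compute capability has native bf16."""
    return tuple(capability) >= MIN_BF16_CAPABILITY


def current_device_capability():
    """Return the active GPU's (major, minor) capability, or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = current_device_capability()
    if cap is None:
        print("No CUDA device visible")
    else:
        print(f"sm_{cap[0]}{cap[1]}: native bf16 = {supports_native_bf16(cap)}")
```

A V100 reports capability `(7, 0)`, so `supports_native_bf16` returns `False` for it, which is consistent with the "cannot support bf16" message.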
-
Hi, when I install with `pip install git+https://github.com/tatsy/torchmcubes.git`, it causes the following (I am using Python 3.12.2 and CUDA 12.2):
```bash
pip install git+https://github.com/tatsy/tor…
-
**Description**
I am trying to use the newly introduced [Triton Inference Server in-process Python API](https://github.com/triton-inference-server…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…