-
### Report of performance regression
I found that the attention (flashattn.py) computation time increased 1.7x after upgrading vLLM from 0.6.0 to 0.6.3.
| | v0.6.0 | v0.6.3 |
| :----: | :----: | :----: |
…
-
Multiplying a `CuSparseMatrixCSC` by a `CuSparseVector` returns a `CuArray` instead of a `CuSparseVector`.
```
using CUDA
using CUDA.CUSPARSE
using Random
using SparseArrays
Random.see…
-
I had an issue in one of the services I work on: it would use more and more memory until it crashed. After some digging, I was able to reduce it to the following script:
```python
import a…
-
### 🐛 Describe the bug
When I execute `tune run lora_finetune_single_device --config xxx.yaml` with the following `yaml` file:
```
# Logging
output_dir: finetune/model-dir/Qwen2.5-0.5B-Instruct-finetu…
-
**Description**
I am trying to use the newly introduced [triton inference server In-Process python API](https://github.com/triton-inference-server…
-
Hi, when I install with `pip install git+https://github.com/tatsy/torchmcubes.git`, I get the output below.
I am using Python 3.12.2 and CUDA 12.2.
```bash
pip install git+https://github.com/tatsy/tor…
-
## Bug Report
@trilinos/stokhos @etphipp
### Description
The `Stokhos_TpetraCrsMatrixUQPCEUnitTest_Cuda_MPI_4` unit test fails in cuda/11.4.2 builds with the following output:
```
...
317: Cu…
-
### Description
When comparing the output of `scipy.spatial.distance_matrix` and `cupyx.scipy.spatial.distance_matrix`, the results differ depending on whether or not I copy the NumPy arrays before …
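
When two implementations disagree like this, an independent reference can help decide which one deviates. The sketch below is not part of the original report: it assumes the default Minkowski order `p=2`, and the helper names `reference_distance_matrix` and `max_abs_diff` are hypothetical. It computes the pairwise distances in plain Python, so the result cannot depend on how the input arrays are laid out in memory.

```python
def reference_distance_matrix(x, y, p=2):
    """Pairwise Minkowski-p distances between the rows of x and the rows of y.

    Hypothetical cross-checking helper. x and y are sequences of
    equal-length coordinate sequences (nested lists, or arrays that
    can be iterated row by row).
    """
    result = []
    for row_x in x:
        out_row = []
        for row_y in y:
            # Sum of |a - b|^p over the coordinates of the two points.
            total = sum(abs(float(a) - float(b)) ** p for a, b in zip(row_x, row_y))
            out_row.append(total ** (1.0 / p))
        result.append(out_row)
    return result


def max_abs_diff(m1, m2):
    """Largest element-wise absolute difference between two equally shaped matrices."""
    return max(
        abs(float(a) - float(b))
        for r1, r2 in zip(m1, m2)
        for a, b in zip(r1, r2)
    )
```

Passing each library's result (converted to nested lists, and moved to the host first in the CuPy case) to `max_abs_diff` together with the reference would show which output drifts and by how much.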
-
### Before You Report a Bug, Please Confirm You Have Done The Following...
- [X] I have updated to the latest version of the packages.
- [x] I have searched for both [existing issues](https://github.…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…