-
### Report of performance regression
I found that the attention (flashattn.py) computation time increased 1.7x after upgrading vLLM from 0.6.0 to 0.6.3.
| | v0.6.0 | v0.6.3 |
| :----: | :----: | :----: |
…
-
**Target:** Determine whether, by default, Torch `scaled_dot_product_attention`, when executing on the Hopper architecture (so that `PLATFORM_SUPPORTS_CUDNN_ATTENTION` and `SM90OrLater` evaluate to `True`), executes…
-
I had an issue in one of the services I work on, where memory usage kept growing until the service crashed. After some digging, I was able to reduce it to the following script:
```python
import a…
-
**Description**
I am trying to use the newly introduced [Triton Inference Server in-process Python API](https://github.com/triton-inference-server…
-
Hi, I am using Python 3.12.2 and CUDA 12.2. When I install with `pip install git+https://github.com/tatsy/torchmcubes.git`, it fails with:
```bash
pip install git+https://github.com/tatsy/tor…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
-
### Description
When comparing the output of `scipy.spatial.distance_matrix` and `cupyx.scipy.spatial.distance_matrix`, the results differ depending on whether or not I copy the NumPy arrays before …
-
### 🐛 Describe the bug
After running the install script and then the help command, I get the following error:
```
(.venv) [byjlw@devvm2168.cco0 ~/local/torchchat (main)]$ python3 torchchat.py --…
-
## Bug Report
@trilinos/stokhos @etphipp
### Description
The `Stokhos_TpetraCrsMatrixUQPCEUnitTest_Cuda_MPI_4` unit test fails in `cuda/11.4.2` builds with the following output:
```
...
317: Cu…
-
[model_input.txt](https://github.com/user-attachments/files/17492230/model_input.txt)
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment inform…