-
I want to use CUDA instead of the CPU to speed up tag inference.
My machine runs Ubuntu 22.04.3 LTS (GNU/Linux 6.5.0-35-generic x86_64) with CUDA 12.2.
I learned from https://onnxruntime.ai/docs/…
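For context, ONNX Runtime picks execution providers in the order you list them and falls back to the CPU provider; the real call is `ort.InferenceSession(model_path, providers=[...])` from the `onnxruntime-gpu` package. A hardware-independent sketch of that priority logic (`select_providers` is an illustrative helper, not an onnxruntime API):

```python
def select_providers(preferred, available):
    """Keep the preferred providers that are actually available, in order;
    fall back to the CPU provider if none match. This mirrors ORT's
    provider-priority behaviour but is only an illustration."""
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]

# On a CUDA-enabled build, CUDA comes first and CPU remains the fallback:
providers = select_providers(
    ["CUDAExecutionProvider", "CPUExecutionProvider"],
    ["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```

If `CUDAExecutionProvider` is missing from `providers` at runtime, the CUDA/cuDNN libraries are usually not visible to the process.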
-
### Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
### Branch name
main
### Commit ID
1621313c0f6dcd82fb67d4d00e0f9552c814361a
### Other environment infor…
-
I’m working on CuPy’s CUDA Graph conditional node support and considering adding a more user-friendly graph constructing API to CuPy.
I have two possible plans to achieve this goal:
- (A) “with”…
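For what it's worth, plan (A) might look roughly like a context manager wrapped around CuPy's existing `stream.begin_capture()` / `stream.end_capture()` pair. A hardware-independent sketch (the `capture` helper and the dummy stream are hypothetical, not CuPy APIs):

```python
import contextlib

@contextlib.contextmanager
def capture(stream):
    # Hypothetical "with"-style wrapper (plan A sketch): work launched on
    # `stream` inside the block is recorded into a graph rather than run.
    stream.begin_capture()
    captured = {}
    try:
        yield captured
    finally:
        captured["graph"] = stream.end_capture()

class _DummyStream:
    # Stand-in for cupy.cuda.Stream so the sketch runs without a GPU.
    def begin_capture(self):
        self.capturing = True
    def end_capture(self):
        self.capturing = False
        return "graph"  # a real stream would return a cupy.cuda.Graph

with capture(_DummyStream()) as result:
    pass  # kernels launched here would be recorded into the graph
```

A conditional-node API would additionally need something like a nested `with` for each branch body, which is where the design questions start.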
-
PyTorch now has some support for representing varlen sequences. It is supported to some extent by HF:
- https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transfor…
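For background, the common varlen representation (used by nested tensors and by varlen attention kernels) flattens all sequences into one buffer plus a list of cumulative offsets, often called `cu_seqlens`. A small pure-Python sketch (`pack_varlen` is an illustrative helper, not a PyTorch API):

```python
def pack_varlen(seqs):
    """Flatten variable-length sequences into one buffer and record
    cumulative offsets (the `cu_seqlens` convention): sequence i lives
    at values[offsets[i]:offsets[i + 1]]."""
    values, offsets = [], [0]
    for s in seqs:
        values.extend(s)
        offsets.append(offsets[-1] + len(s))
    return values, offsets

values, cu_seqlens = pack_varlen([[1, 2], [3], [4, 5, 6]])
# values     -> [1, 2, 3, 4, 5, 6]
# cu_seqlens -> [0, 2, 3, 6]
```

This layout avoids padding entirely, which is why kernels that accept it tend to be faster on ragged batches.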
-
### Describe the bug
The required subgroup size kernel attribute is incorrectly reported on CUDA and HIP devices. When checking the compiled subgroup size of a kernel that had the required s…
-
### 🐛 Describe the bug
I can successfully export the Vulkan `.pte`. However, when I run the model with
```sh
./backends/vulkan/vulkan_executor_runner --model_path /scratch/models/vulkan_mobilenetv2.pte
```…
-
### 🐛 Describe the bug
I know this is strange behavior, but I have set `manual_seed` each time.
BTW, if I use `cudagraphs` on the `cuda` device, there is no inconsistency.
```pytho…
-
## 🐛 Bug
A service started from Meta-Llama-3.1-70B-Instruct fp8 crashes under high concurrency.
## To Reproduce
### convert model
Refer to this issue: #2982
### start s…
-
Does TF Serving support CUDA graphs?
-
### 1. System information
- OS Platform and Distribution: Ubuntu 22.04
- TensorFlow installation (pip package or built from source): pip
- TensorFlow library (version, if pip package or github SH…