-
**Description**
I'm trying to serve an embedding model (FastText) in triton-server using Python as its backend. The only external dependency is the fasttext module, which in turn depends on numpy. …
-
**Bug summary**
LLVM's NVPTX backend seems to have problems with AtomicLoad: https://github.com/llvm/llvm-project/issues/48651
As a result, at least some versions of Clang (Ubuntu's Clang 12 + C…
-
Add common functions, such as device memory size, to `device.h`.
Add backend-specific features, such as CUDA/OpenCL compute, to `cuda.h` and `opencl.h`.
-
I'm glad torch.compile can speed things up so much. On an A5000 it gives a 60% speedup, but there is no acceleration on an L4. Why does this happen?
Here is my code; you can set --compile when r…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Fine-tuning with data-parallel LoRA fails with the following error:
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11…
-
Would it be an idea to define the same kernels that exist in the CUDA backend with ThunderKittens as well? They have cool examples with FlashAttention2 and I think it would be interesting to have as a…
-
Hello,
I realize it might be too early for Windows support, but I didn't see an existing issue on this.
In case it hasn't been tested yet, I just wanted to point out that I encountered the following…
-
Hi, CUDA 9.2 compiles without any issues; however, I get the following when compiling with CUDA 8 (for older GPUs). I'm using Windows 10 and VS 2017.
"D:\Dropbox\git\xmr-stak\build\install.vcxproj…
-
Once I get "you don't have state dict", I can't generate an image with the SD model that is set, even if I complete the state dict, because of "'NoneType' object has no attribute 'sd_checkpoint_info'"
on…
-
I tried the ROCm precompiled binary because I have a 7900 XTX, and the result is the image below.
I also tried compiling it myself but got a lot of errors. I also have ROCm 6.1 installed if it matt…