-
We currently have limited-to-no support for using `complex` numbers in a GPU kernel.
I tried a number of simple cases with complex numbers to get an idea of where our support is at. All tests are d…
-
### What happened?
After running the `cmake --build build --config release` command on an Ascend 310P3, the build does not compile successfully.
![image](https://github.com/user-attachments/assets/74ef8d67-e859-4502-ac…
-
### System Info
After training `Zyphra/Zamba2-1.2B`, I tried to run inference on CPU but got an error:
```
File "virtual_envs/neural_asr_training/lib/python3.10/site-packages/causal_conv1d/causal…
```
-
@CheukHinHoJerry -- I think before we can get rid of ACE1 and ACE1x we may want to implement at least the 2B purification in the new implementation. Do you have any thoughts on that? Can we both thin…
-
When using MarlinInt4WeightQBitsTensor and its associated optimized gemm kernel, there are issues with the weight/scales/zero-point readback as soon as parallelization increases.
The consequence i…
-
**What is your question?**
Hello!
I’ve been exploring the Cutlass examples for GEMM and Convolution and noticed the use of double buffering.
https://developer.nvidia.com/blog/cutlass-linear-algebra-…
-
Hi there!
you wrote:
> NOTE: I had to keep the kernel locked to the original (i.e. block kernel upgrades) due to kernel panic with newer kernels, not sure why, YMMV.
Are you referring to `5.1…
-
This is an issue at commit: 87ee0b46b834f67bad9025d4a82ed5654f3403d3
I tried enabling GCC LTO for PyTorch in this PR: https://github.com/pytorch/pytorch/pull/137866 and hit this warning that is treat…
-
Presently our zone kernels do not provide everything needed to build the nvidia driver, so we wind up using a host kernel image with them instead.
-
After I installed a newer kernel, the script removed that new kernel as well.
Could you improve the script further so that it keeps the current kernel and any NEWER kernels?
_Workaroun…