cuda-kernels Search Results

1000+ results for cuda-kernels


koide3/gtsam_points #27

cmake error

gtsam_points/cuda/kernels/vgicp_derivatives.cuh(49): error: calling a __host__ function("Eigen::MatrixBase< ::Eigen::CwiseBinaryOp< ::Eigen::internal::scalar_sum_op , const ::Eigen::Matrix , const :…

darkdelting updated 1 day ago, 3 comments

scikit-hep/awkward #3173

remaining CUDA kernels not yet in main

The following are completed in #3150 for `n = 2`: - [x] awkward_ListArray_combinations - [x] awkward_RegularArray_combinations_64 The following are completed in #3149: - [x] awkward_reduc…

jpivarski updated 3 months ago, 1 comment

Zyphra/transformers_zamba2 #2

`use_mamba_kernels` has no effect?

### System Info After training `Zyphra/Zamba2-1.2B` trying to run inference on CPU but got an error: ``` File "virtual_envs/neural_asr_training/lib/python3.10/site-packages/causal_conv1d/causal…

RuABraun updated 1 month ago, 1 comment

ICLDisco/dplasma #98

HIP: support all the same kernels as CUDA

## Description We support a limited subset of kernels with HIP devices compared to CUDA. ### Describe the solution you'd like Every algorithm that is CUDA accelerated should also be HIP accel…

abouteiller updated 1 month ago, 2 comments

NVIDIA-AI-IOT/tensorrt_plugin_generator #4

Adding cuda kernels

I was able to create a plugin, and this repo really helped with the boilerplate code. One addition that I want to make in my `enqueue` function is to call my CUDA binding. How can I make…

sandeepnmenon updated 1 year ago, 1 comment

mit-han-lab/TinyChatEngine #71

Assistant spitting out non-readable characters on RTX 4060

``` (TinyChatEngine) zhef@zhef:~/TinyChatEngine/llm$ make chat -j CUDA is available! src/Generate.cc src/LLaMATokenizer.cc src/OPTGenerate.cc src/OPTTokenizer.cc src/utils.cc src/nn_modules/Fp32OPT…

zhefciad updated 3 weeks ago, 4 comments

mys007/ecc #6

cuda_kernels help

Hey, I am working closely with your code. I would like to ask a favor: could you please help me understand the CUDA kernels? My email address is thomasc@helix.re. I have benchmark you…

tchaton updated 5 years ago, 5 comments

NVIDIA/CUDALibrarySamples #228

cusparseLtMatmul example is much slower than cublasGemmEx

Hi, guys, I compiled the example code for cusparseLt from https://github.com/NVIDIA/CUDALibrarySamples/tree/master/cuSPARSELt/matmul, using the default problem size, and used Nsight sy…

SimonSongg updated 19 hours ago, 6 comments

fattorib/fusedswiglu #1

Wall clock speed is slower than Pytorch primitives

Hi! Thank you for your amazing work! I'm having some trouble comparing the fused swiglu kernel with its plain PyTorch version. I checked the wall clock time with the code below, and it gives me l…

rustic-snob updated 1 month ago, 1 comment

mit-han-lab/llm-awq #124

can not install awq CUDA kernels

I'm trying to follow [this](https://github.com/mit-han-lab/llm-awq#install) to install awq, but it failed at step 3. ## My Env ``` OS: Windows 11 GPU: NVIDIA GeForce RTX4060 Driver Version: 536.4…

ycyaoxdu updated 4 months ago, 4 comments

