-
I would like to inquire about the performance of two kernels:
naive_conv_nonpacked_bwd_nchw_half_double_half
naive_conv_nonpacked_fwd_nchw_half_double_half
When are these used when we call `miope…
-
Using Fortran-style 1D indexing on the parent, with any required assertions done upstream, might be easiest for some kernels. E.g.:
```julia
function Base.copyto!(
dest::IJFH{S, Nij},
bc:…
-
Hi, I'm benchmarking flashinfer on H100, and I'm running attention for the decoding stage.
I use q_head = kv_head = 40, which is the standard attention for llama 13B.
I tried use_tensor_cores = …
-
### System Info
CPU: X86
Memory size: 2TB
GPU Name: H20
TensorRT-LLM: 0.10.0
OS: Alibaba Cloud Linux release 3 (Soaring Falcon)
GPU Driver: 550.54.15
CUDA: cuda_12.4.r12.4/compiler.33961263_0
Do…
-
Is there any way I can get this going for the latest Raspbian kernels? Or are there instructions on how to compile this from source?
-
Some distributions have started to move the kernels out of the boot directory by default and let the install hooks etc handle copying them over.
The kernel hooks would then copy the kernels into `/…
-
There are a number of definitions in `kernels` that aren't part of the kernels themselves but are used in them. A (non-exhaustive and potentially outdated) list of their header files is below:
-…
-
I can't see anything obvious in the changelogs, but it looks like support for Linux 4.x kernels was dropped at some point after `1.18.9`. We currently run some clusters that have a combination of `4.19…
-
Hey @sergisiso,
With nvfortran 24.5, compilation fails with:
```
NVFORTRAN-S-1000-Call in OpenACC region to procedure 'nf90_enddef' which has no acc routine information (.../nemo-4.0_mirror_SI3…
-
Hi,
I have been trying to use MLIR-AIE for Ryzen AI NPU on a Windows laptop, with WSL Ubuntu 22.04, without success so far using the examples in https://github.com/Xilinx/mlir-aie/tree/main/program…