-
# Summary
GpuIndexIVFScalarQuantizer with scalar quantizers that require shared memory on the GPU doesn't seem to work for k >= 1024 in Faiss 1.7.4. See the small reproduction script at the bottom.
…
-
Environment
If applicable, please include the following:
CPU architecture: x86_64
GPU properties
GPU name: NVIDIA A10
Clock frequencies used: None
Libraries
TensorRT branch: 9.0.0
TensorRT LLM: 0.1.3…
-
https://godbolt.org/z/bccGETacM
Kernel code is:
```c
const long int size = 64;
__global__ void cupy_scan_naive(int *out)
{
    __shared__ int smem1[size];
    const int lane_id = threadIdx.…
-
Hello,
I'd like to report a potential hazard in the epilogue when `kUseVarSeqLen=true`. Under this condition, the epilogue uses `write_tiled()` instead of `write_tma()` to write `O` to the gm…
-
Build on H20
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Buil…
-
Hi. I am trying to install and use mamba, but I can't install causal-conv1d with pip. I then tried to build it from source, but I get the same error. Please help me.
Building wheel for causal-conv1d (s…
-
Not sure if this is the right place for this question. I couldn't find contact info on the site, so I figured I'd open an issue.
I'm working with a group that's trying to use the FMDIndex (along wit…
-
```
if constexpr (!Is_causal) { // Just masking based on col
    if (int(get(tScS(i))) >= int(seqlen_k - n_block * kBlockN)) {
        tSrS(i) = -I…
-
While analysing alfalfa (cf. https://github.ugent.be/ComputationalBiology/alfalfa/tree/smems), we noticed that PerfExpert completely ignores (SSE) integer computations.
Are there any plans to also cons…
-
### Bug description
When running the provided code as a standalone executable, a CUDA illegal memory access is reported. Using compute-sanitizer, I could pinpoint this to an illegal shared memory a…