flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0
768
stars
64
forks
issues
build raise "cub::BlockAdjacentDifference<__nv_bool, 1024, 1, 1, 860>" has no member "SubtractLeft"
#261
WanBenLe
closed
1 month ago
8
sampling: fused speculative sampling kernels
#259
yzh119
closed
1 month ago
0
[Bug report] BatchPrefillWithPagedKVCachePyTorchWrapper failed to dispatch group_size 3
#258
merrymercy
closed
3 weeks ago
3
[Feature request] Support attention logits cap with tanh
#257
merrymercy
closed
3 weeks ago
5
perf: initial cuda graph support
#256
yzh119
closed
1 month ago
1
bugfix: fix pybind class bindings
#255
yzh119
closed
1 month ago
0
Qwen1.5-32B failed: BatchPrefillWithPagedKVCachePyTorchWrapper failed to dispatch group_size 5
#254
QwertyJack
closed
3 weeks ago
1
perm: use page-locked host memory for auxiliary data structure on CPU
#253
yzh119
closed
1 month ago
0
cmake: backward compatibility for TVM_HOME
#252
yzh119
closed
1 month ago
0
cmake: rename TVM_HOME to TVM_SOURCE_DIR
#251
yzh119
closed
1 month ago
0
Can BatchDecodeWithPaddedKVCache be used in cascade inference?
#250
joey12300
opened
1 month ago
1
CUDA Error: no kernel image is available for execution on the device (209) /tmp/build-via-sdist-nl8se4dx/flashinfer-0.0.4+cu118torch2.2/include/flashinfer/attention/decode.cuh: line 871 at function cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size)
#249
lucasjinreal
opened
1 month ago
2
Circular import error when importing built-from-source flashinfer
#248
vedantroy
opened
1 month ago
1
Fix compile/assert on group_size
#247
Qubitium
closed
3 weeks ago
1
Add group_size 7 and fix compat with Yi 1.5 34b
#246
Qubitium
closed
1 month ago
3
multiple definition of `cuda::__3::pipeline...
#245
jpf888
opened
1 month ago
0
Move -Wno-switch-bool argument to cxx from nvcc
#244
mgerstgrasser
closed
1 month ago
0
Compilation fails due to "-Wno-switch-bool" nvcc flag
#243
mgerstgrasser
closed
1 month ago
0
Can the Volta/Tesla architectures be supported?
#242
alexngng
opened
1 month ago
0
bugfix: Fix dispatcher in src directory
#241
yzh119
closed
1 month ago
0
bugfix: fix the `generate_dispatch_inc` script
#240
yzh119
closed
1 month ago
0
compilation: Suppress switch bool warning
#239
yzh119
closed
1 month ago
0
sampling: expose sampling APIs in pytorch
#238
yzh119
closed
2 months ago
0
Support MLA (Multi-Head Latent Attention) in DeepSeek-v2
#237
yzh119
opened
2 months ago
0
doc: bump documentation version
#236
yzh119
closed
2 months ago
0
cmake: macro trimming
#235
yzh119
closed
1 month ago
0
ci: update release wheel yaml
#234
yzh119
closed
2 months ago
0
fix: remove 8 from default page size
#233
yzh119
closed
2 months ago
0
chore(main): release 0.0.5
#232
github-actions[bot]
closed
2 weeks ago
1
fix: fix macro to suppress compilation warning
#231
yzh119
closed
2 months ago
0
Revert "ci: remove multi-threading in nvcc compile flags (#229)"
#230
yzh119
closed
2 months ago
0
ci: remove multi-threading in nvcc compile flags
#229
yzh119
closed
2 months ago
0
bugfix: fix MANIFEST.in
#228
yzh119
closed
2 months ago
0
Support torch 2.3
#227
rkooo567
closed
2 months ago
3
bugfix: fix the potential issue of sampling kernels
#226
yzh119
closed
2 months ago
0
bugfix: Fix the correctness issue of sampling kernel
#225
yzh119
closed
2 months ago
0
Fix implicit cast in sampling
#224
abcdabcd987
closed
2 months ago
0
support versatile gqa size for batch prefill
#223
xuzhenqi
closed
1 month ago
3
TypeError: get_cu_file_str() missing 1 required positional argument: 'idtype'
#222
xuzhenqi
closed
2 months ago
1
bugfix: fix sampler's implementation bug when dtype is not float32
#221
yzh119
closed
2 months ago
0
cmake: fix cmake files
#220
yzh119
closed
2 months ago
0
misc: make max_top_p/k_rounds an input argument instead of template parameter
#219
yzh119
closed
2 months ago
0
fix: revert #144
#218
yzh119
closed
2 months ago
1
[BugFix] Fix build error related to dispatch page size
#217
esmeetu
closed
2 months ago
1
ci: add pytorch 2.3 to matrix
#216
yzh119
closed
2 months ago
0
[TVMWrapper] Add wrapper functions for sampler
#215
MasterJH5574
closed
2 months ago
0
misc: parallel sampling from probability
#214
yzh119
closed
2 months ago
0
sampling: support parallel top-p sampling
#213
yzh119
closed
2 months ago
0
perm: optimize sampling performance
#212
yzh119
closed
2 months ago
0
sampling: fix alignment issue for vocab_size not divisible by vec_size
#211
yzh119
closed
2 months ago
0