intel / xFasterTransformer
Apache License 2.0 · 268 stars · 52 forks
Issues (sorted by: Newest)
#466 · [Kernel] Make SelfAttention prepared for AMX_FP16; More balanced task split in Cross Attention · opened by pujiang2018 20 hours ago · 1 comment
#465 · [Readme] Add accepted papers · closed, wenhuanh, 1 day ago · 0 comments
#464 · [Layers] Fix invokeAttentionLLaMA API · closed, wenhuanh, 1 day ago · 1 comment
#463 · [Dependency] Bump web_demo requirement. · closed, Duyi-Wang, 3 days ago · 0 comments
#462 · Add env param KV_CACHE_LOCATION to control kv cache memory numanode location · opened by a3213105 1 week ago · 2 comments
#461 · [Model] Group support for int8/int4 models · opened by xiangzez 1 week ago · 0 comments
#460 · [Kernel] Cache oneDNN primitive when M < `XFT_PRIMITIVE_CACHE_M`, default 256. · closed, Duyi-Wang, 1 week ago · 0 comments
#459 · [Layers] Enable AMX FP16 of FlashAttn · closed, abenmao, 3 days ago · 0 comments
#458 · [Denpendency] Pin python requirements.txt version. · closed, Duyi-Wang, 1 week ago · 0 comments
#457 · [Bugfix] fixed shm reduceAdd & rope error when batch size is large · closed, abenmao, 2 weeks ago · 0 comments
#456 · [Feature] Enable AMX FP16 on next generation CPU · closed, wenhuanh, 1 week ago · 4 comments
#455 · [run_benchmark.sh] Few cores are running on HBM when batch-size >16 or 32 · closed, hangfu-guo, 2 weeks ago · 3 comments
#454 · [Version] v1.7.2. · closed, Duyi-Wang, 2 weeks ago · 0 comments
#453 · [Model] Support hybrid model in continuous batching. · closed, Duyi-Wang, 2 weeks ago · 0 comments
#452 · [Kernel] Enable continuous batching on single GPU. · closed, changqi1, 2 weeks ago · 0 comments
#451 · [Tools] Add Baichuan1/2 convert tool · closed, abenmao, 2 weeks ago · 0 comments
#450 · [Framework] Remove duplicated code · closed, xiangzez, 2 weeks ago · 0 comments
#449 · [Layers] Add qwenRope support for Qwen1.0 in CB mode · closed, abenmao, 2 weeks ago · 2 comments
#448 · [Doc] Add vllm benchmark docs. · closed, marvin-Yu, 3 weeks ago · 0 comments
#447 · [request] qwen1 not supported by vllm-xft · closed, zhm-algo, 2 weeks ago · 3 comments
#446 · [bug] HBM flat QUAD mode determination method is incorrect · opened by xuyizhan 3 weeks ago · 0 comments
#445 · [Version] v1.7.1. · closed, Duyi-Wang, 3 weeks ago · 0 comments
#444 · Fixed punctuation error in README · opened by denniszhen1 3 weeks ago · 0 comments
#443 · Update README.md · closed, denniszhen1, 3 weeks ago · 0 comments
#442 · Bump gradio from 4.19.2 to 4.36.0 in /examples/web_demo · closed, dependabot[bot], 4 weeks ago · 0 comments
#441 · [Model] Fix array out of bounds when rank > 2. · closed, Duyi-Wang, 4 weeks ago · 1 comment
#440 · Crash when using CB mode with multi-rank · closed, a3213105, 4 weeks ago · 0 comments
#439 · [Model] Add Qwen2 GPTQ model support · closed, xiangzez, 4 weeks ago · 0 comments
#438 · Add Continue Batching support for Chatglm2/3 · closed, a3213105, 4 weeks ago · 1 comment
#437 · [Kernel] Expand rmsNorm op. · closed, changqi1, 4 weeks ago · 2 comments
#436 · [Common] Add INT8/UINT4 to BF16 weight convert · closed, xiangzez, 1 month ago · 0 comments
#435 · [README] Update README.md. · closed, Duyi-Wang, 1 month ago · 0 comments
#434 · [README] Update README.md. · closed, Duyi-Wang, 1 month ago · 0 comments
#433 · [Version] v1.7.0. · closed, Duyi-Wang, 1 month ago · 0 comments
#432 · [Dependency] Fix wrong so path returned in `get_env()`. · closed, Duyi-Wang, 1 month ago · 0 comments
#431 · [README] Update readme. · closed, Duyi-Wang, 1 month ago · 0 comments
#430 · [Dependency] Update libiomp5.so to `5.0.20230815` contained in mkl. · closed, Duyi-Wang, 1 month ago · 0 comments
#429 · [Layers] Fixed error in yarn · closed, abenmao, 1 month ago · 0 comments
#428 · [Layers] Increased the threshold for enabling flashAttn · opened by abenmao 1 month ago · 0 comments
#427 · [Python] Add `get_env()` to get LD_PRELOAD set. · closed, Duyi-Wang, 1 month ago · 0 comments
#426 · [CI] Check gcc version. · closed, changqi1, 1 month ago · 0 comments
#425 · [Kernel] Add dynamic onednn matmul. · opened by changqi1 1 month ago · 0 comments
#424 · [Layers] Fixed the seg fault error when running with more than 4 ranks · closed, abenmao, 1 month ago · 0 comments
#423 · [COMM] Fix bugs of core dump && hang when running cross nodes · closed, abenmao, 1 month ago · 0 comments
#422 · [xDNN] Release v1.5.1. · closed, changqi1, 1 month ago · 0 comments
#421 · [Distribute] Add distribute support for continuous batching api. · closed, Duyi-Wang, 1 month ago · 2 comments
#420 · [Kernel] Less compute for Self-Attention (Q * K) · closed, pujiang2018, 1 month ago · 0 comments
#419 · gcc 8.2 compilation error · closed, bukejiyu, 4 weeks ago · 3 comments
#418 · Add --padding and fix bug · closed, yangkunx, 1 month ago · 0 comments
#417 · [Kernel] Add oneDNN AMX_FP16 compute kernels. · closed, changqi1, 1 month ago · 1 comment