intel / xFasterTransformer · Apache License 2.0 · 355 stars · 61 forks
Issues (newest first)
#477 · 1st Token Latency Performance Issue for Datatype w8a8 — opened by qiuyuleng1, 1 week ago (1 comment)
#476 · Illegal instruction (core dumped) — closed, wcollin, 1 week ago (1 comment)
#475 · Illegal instruction (core dumped) — closed, zwx109473, 1 week ago (3 comments)
#474 · docs: add Japanese README — opened by eltociear, 2 months ago (0 comments)
#473 · Segmentation fault (core dumped) FIRST_TOKEN_WEIGHT_LOCATION=$1 NEXT_TOKEN_WEIGHT_LOCATION=$2 OMP_NUM_THREADS=$3 numactl -C $cpu_index -p $2 $BENCHMARK — closed, LittleNoob2333, 1 month ago (2 comments)
#472 · Update Broken QR Code Link for WeChat on Wiki Page — closed, w1ida, 2 months ago (1 comment)
#471 · oneDNN and XFT performance — opened by LittleNoob2333, 2 months ago (0 comments)
#470 · add bf16_int8 support for invokeLayerLLaMA API — opened by miaojinc, 2 months ago (0 comments)
#469 · about benchmark Illegal instruction — closed, LittleNoob2333, 2 months ago (1 comment)
#468 · [Kernel] Upgrade xDNN to v1.5.2 and make AMX_FP16 work — closed, pujiang2018, 3 months ago (0 comments)
#467 · [Readme] Update README_CN.md — closed, tianyeeT, 2 months ago (0 comments)
#466 · [Kernel] Make SelfAttention prepared for AMX_FP16; More balanced task split in Cross Attention — closed, pujiang2018, 3 months ago (2 comments)
#465 · [Readme] Add accepted papers — closed, wenhuanh, 3 months ago (0 comments)
#464 · [Layers] Fix invokeAttentionLLaMA API — closed, wenhuanh, 3 months ago (1 comment)
#463 · [Dependency] Bump web_demo requirement. — closed, Duyi-Wang, 3 months ago (0 comments)
#462 · Add env param KV_CACHE_LOCATION to control kv cache memory numanode location — opened by a3213105, 3 months ago (2 comments)
#461 · [Model] Group support for int8/int4 models — opened by xiangzez, 3 months ago (0 comments)
#460 · [Kernel] Cache oneDNN primitive when M < `XFT_PRIMITIVE_CACHE_M`, default 256. — closed, Duyi-Wang, 3 months ago (0 comments)
#459 · [Layers] Enable AMX FP16 of FlashAttn — closed, abenmao, 3 months ago (0 comments)
#458 · [Denpendency] Pin python requirements.txt version. — closed, Duyi-Wang, 3 months ago (0 comments)
#457 · [Bugfix] fixed shm reduceAdd & rope error when batch size is large — closed, abenmao, 3 months ago (0 comments)
#456 · [Feature] Enable AMX FP16 on next generation CPU — closed, wenhuanh, 3 months ago (4 comments)
#455 · [run_benchmark.sh] Few cores are running on HBM when batch-size >16 or 32 — closed, hangfu-guo, 3 months ago (3 comments)
#454 · [Version] v1.7.2. — closed, Duyi-Wang, 3 months ago (0 comments)
#453 · [Model] Support hybrid model in continuous batching. — closed, Duyi-Wang, 3 months ago (0 comments)
#452 · [Kernel] Enable continuous batching on single GPU. — closed, changqi1, 3 months ago (0 comments)
#451 · [Tools] Add Baichuan1/2 convert tool — closed, abenmao, 3 months ago (0 comments)
#450 · [Framework] Remove duplicated code — closed, xiangzez, 3 months ago (0 comments)
#449 · [Layers] Add qwenRope support for Qwen1.0 in CB mode — closed, abenmao, 3 months ago (2 comments)
#448 · [Doc] Add vllm benchmark docs. — closed, marvin-Yu, 3 months ago (0 comments)
#447 · [request] qwen1 not supported by vllm-xft — closed, zhm-algo, 3 months ago (3 comments)
#446 · [bug] HBM flat QUAD mode determination method is incorrect — closed, xuyizhan, 2 months ago (1 comment)
#445 · [Version] v1.7.1. — closed, Duyi-Wang, 3 months ago (0 comments)
#444 · Fixed punctuation error in README — closed, denniszhen1, 2 months ago (0 comments)
#443 · Update README.md — closed, denniszhen1, 3 months ago (0 comments)
#442 · Bump gradio from 4.19.2 to 4.36.0 in /examples/web_demo — closed, dependabot[bot], 4 months ago (0 comments)
#441 · [Model] Fix array out of bounds when rank > 2. — closed, Duyi-Wang, 4 months ago (1 comment)
#440 · Crash when using CB mode with multi-rank — closed, a3213105, 4 months ago (0 comments)
#439 · [Model] Add Qwen2 GPTQ model support — closed, xiangzez, 4 months ago (0 comments)
#438 · Add Continue Batching support for Chatglm2/3 — closed, a3213105, 4 months ago (1 comment)
#437 · [Kernel] Expand rmsNorm op. — closed, changqi1, 4 months ago (2 comments)
#436 · [Common] Add INT8/UINT4 to BF16 weight convert — closed, xiangzez, 4 months ago (0 comments)
#435 · [README] Update README.md. — closed, Duyi-Wang, 4 months ago (0 comments)
#434 · [README] Update README.md. — closed, Duyi-Wang, 4 months ago (0 comments)
#433 · [Version] v1.7.0. — closed, Duyi-Wang, 4 months ago (0 comments)
#432 · [Dependency] Fix wrong so path returned in `get_env()`. — closed, Duyi-Wang, 4 months ago (0 comments)
#431 · [README] Update readme. — closed, Duyi-Wang, 4 months ago (0 comments)
#430 · [Dependency] Update libiomp5.so to `5.0.20230815` contained in mkl. — closed, Duyi-Wang, 4 months ago (0 comments)
#429 · [Layers] Fixed error in yarn — closed, abenmao, 4 months ago (0 comments)
#428 · [Layers] Increased the threshold for enabling flashAttn — opened by abenmao, 4 months ago (0 comments)