intel / xFasterTransformer · Apache License 2.0 · 355 stars · 61 forks
Issues (newest first)
#477 · 1st Token Latency Performance Issue for Datatype w8a8 — opened by qiuyuleng1, 1 week ago (1 comment)
#476 · Illegal instruction (core dumped) — closed, wcollin, 1 week ago (1 comment)
#475 · Illegal instruction (core dumped) — closed, zwx109473, 1 week ago (3 comments)
#474 · docs: add Japanese README — opened by eltociear, 2 months ago (0 comments)
#473 · Segmentation fault (core dumped) FIRST_TOKEN_WEIGHT_LOCATION=$1 NEXT_TOKEN_WEIGHT_LOCATION=$2 OMP_NUM_THREADS=$3 numactl -C $cpu_index -p $2 $BENCHMARK — closed, LittleNoob2333, 1 month ago (2 comments)
#472 · Update Broken QR Code Link for WeChat on Wiki Page — closed, w1ida, 2 months ago (1 comment)
#471 · oneDNN and XFT performance — opened by LittleNoob2333, 2 months ago (0 comments)
#470 · add bf16_int8 support for invokeLayerLLaMA API — opened by miaojinc, 2 months ago (0 comments)
#469 · about benchmark Illegal instruction — closed, LittleNoob2333, 2 months ago (1 comment)
#468 · [Kernel] Upgrade xDNN to v1.5.2 and make AMX_FP16 work — closed, pujiang2018, 3 months ago (0 comments)
#467 · [Readme] Update README_CN.md — closed, tianyeeT, 2 months ago (0 comments)
#466 · [Kernel] Make SelfAttention prepared for AMX_FP16; More balanced task split in Cross Attention — closed, pujiang2018, 3 months ago (2 comments)
#465 · [Readme] Add accepted papers — closed, wenhuanh, 3 months ago (0 comments)
#464 · [Layers] Fix invokeAttentionLLaMA API — closed, wenhuanh, 3 months ago (1 comment)
#463 · [Dependency] Bump web_demo requirement. — closed, Duyi-Wang, 3 months ago (0 comments)
#462 · Add env param KV_CACHE_LOCATION to control kv cache memory numanode location — opened by a3213105, 3 months ago (2 comments)
#461 · [Model] Group support for int8/int4 models — opened by xiangzez, 3 months ago (0 comments)
#460 · [Kernel] Cache oneDNN primitive when M < `XFT_PRIMITIVE_CACHE_M`, default 256. — closed, Duyi-Wang, 3 months ago (0 comments)
#459 · [Layers] Enable AMX FP16 of FlashAttn — closed, abenmao, 3 months ago (0 comments)
#458 · [Denpendency] Pin python requirements.txt version. — closed, Duyi-Wang, 3 months ago (0 comments)
#457 · [Bugfix] fixed shm reduceAdd & rope error when batch size is large — closed, abenmao, 3 months ago (0 comments)
#456 · [Feature] Enable AMX FP16 on next generation CPU — closed, wenhuanh, 3 months ago (4 comments)
#455 · [run_benchmark.sh] Few cores are running on HBM when batch-size >16 or 32 — closed, hangfu-guo, 3 months ago (3 comments)
#454 · [Version] v1.7.2. — closed, Duyi-Wang, 3 months ago (0 comments)
#453 · [Model] Support hybrid model in continuous batching. — closed, Duyi-Wang, 3 months ago (0 comments)
#452 · [Kernel] Enable continuous batching on single GPU. — closed, changqi1, 3 months ago (0 comments)
#451 · [Tools] Add Baichuan1/2 convert tool — closed, abenmao, 3 months ago (0 comments)
#450 · [Framework] Remove duplicated code — closed, xiangzez, 3 months ago (0 comments)
#449 · [Layers] Add qwenRope support for Qwen1.0 in CB mode — closed, abenmao, 3 months ago (2 comments)
#448 · [Doc] Add vllm benchmark docs. — closed, marvin-Yu, 3 months ago (0 comments)
#447 · [request] qwen1 not supported by vllm-xft — closed, zhm-algo, 3 months ago (3 comments)
#446 · [bug] HBM flat QUAD mode determination method is incorrect — closed, xuyizhan, 2 months ago (1 comment)
#445 · [Version] v1.7.1. — closed, Duyi-Wang, 3 months ago (0 comments)
#444 · Fixed punctuation error in README — closed, denniszhen1, 2 months ago (0 comments)
#443 · Update README.md — closed, denniszhen1, 3 months ago (0 comments)
#442 · Bump gradio from 4.19.2 to 4.36.0 in /examples/web_demo — closed, dependabot[bot], 4 months ago (0 comments)
#441 · [Model] Fix array out of bounds when rank > 2. — closed, Duyi-Wang, 4 months ago (1 comment)
#440 · Crash when using CB mode with multi-rank — closed, a3213105, 4 months ago (0 comments)
#439 · [Model] Add Qwen2 GPTQ model support — closed, xiangzez, 4 months ago (0 comments)
#438 · Add Continue Batching support for Chatglm2/3 — closed, a3213105, 4 months ago (1 comment)
#437 · [Kernel] Expand rmsNorm op. — closed, changqi1, 4 months ago (2 comments)
#436 · [Common] Add INT8/UINT4 to BF16 weight convert — closed, xiangzez, 4 months ago (0 comments)
#435 · [README] Update README.md. — closed, Duyi-Wang, 4 months ago (0 comments)
#434 · [README] Update README.md. — closed, Duyi-Wang, 4 months ago (0 comments)
#433 · [Version] v1.7.0. — closed, Duyi-Wang, 4 months ago (0 comments)
#432 · [Dependency] Fix wrong so path returned in `get_env()`. — closed, Duyi-Wang, 4 months ago (0 comments)
#431 · [README] Update readme. — closed, Duyi-Wang, 4 months ago (0 comments)
#430 · [Dependency] Update libiomp5.so to `5.0.20230815` contained in mkl. — closed, Duyi-Wang, 4 months ago (0 comments)
#429 · [Layers] Fixed error in yarn — closed, abenmao, 4 months ago (0 comments)
#428 · [Layers] Increased the threshold for enabling flashAttn — opened by abenmao, 4 months ago (0 comments)