intel/xFasterTransformer
Apache License 2.0
348 stars
60 forks
issues
[model] Add deepseek model.
#274
marvin-Yu
closed
5 months ago
0
Issue qwen72b seq length
#273
a3213105
closed
5 months ago
1
[Common] Unify memory allocation into xft::alloc
#272
pujiang2018
closed
5 months ago
0
[Include] Fix include not work.
#271
Duyi-Wang
closed
6 months ago
0
[include] Organize include file
#270
wenhuanh
closed
6 months ago
0
4 sockets qwen execute question
#269
Storm0921
closed
5 months ago
13
[Kernel] increase parallelism for KV cache copy in self attention
#268
pujiang2018
closed
6 months ago
0
[Kernel] Fix the incorrect computing which should be in float, but was in integer
#267
pujiang2018
closed
6 months ago
0
[Layer] Reduce repeated sin and cos embedding table data to optimize ROPE perf.
#266
changqi1
closed
6 months ago
0
[Layer] Use flash attention when larger than threshold ('>=' to '>')
#265
pujiang2018
closed
6 months ago
0
gpu kernel: rms_norm
#264
aurora327
closed
6 months ago
0
[Benchmark] Modify CPU affinity logic, add CI prompt output.
#263
marvin-Yu
closed
6 months ago
0
[Version] v1.4.0.
#262
Duyi-Wang
closed
6 months ago
0
[Benchmark] Fix typo in benchmark script.
#261
Duyi-Wang
closed
6 months ago
0
[Search] Sync sample result in multi-rank.
#260
Duyi-Wang
closed
6 months ago
0
[Model] Add gemma model support.
#259
marvin-Yu
closed
5 months ago
3
[Attention Kernel/Layer] group attention support in full-link BF16 path; attention layer refactor
#258
pujiang2018
closed
6 months ago
0
[Benchmark] Update model cfg for transformers>4.36.
#257
Duyi-Wang
closed
6 months ago
0
typo in benchmark script
#255
sssssux
closed
6 months ago
1
[Serving] Fix fail to set pad_token_id when it's not None in single mode.
#254
Duyi-Wang
closed
6 months ago
0
[Kernel] Add oneDNN GPU kernels.
#253
changqi1
closed
3 months ago
0
[layers] Add bf16-type input/output support for flash attention
#252
abenmao
closed
6 months ago
0
Fix Opt issue
#251
xiangzez
closed
6 months ago
0
support vllm?
#250
leiwen83
closed
3 months ago
2
[Fix] Fall back to float to bypass issues with MQA/GQA.
#249
marvin-Yu
closed
6 months ago
1
[Docs] Initial documents.
#248
Duyi-Wang
closed
6 months ago
0
Illegal instruction (core dumped)
#247
wswsmao
closed
6 months ago
5
[Dependency] Update web demo requirement.
#246
Duyi-Wang
closed
6 months ago
0
[Kernel] Set USE_AMX_M to 1.
#245
Duyi-Wang
closed
6 months ago
0
Error install xfastertransformer on CentOS 7.6
#244
bin1guo
closed
6 months ago
1
[Example] Add llama2 chat support in Cli demo.
#243
Duyi-Wang
closed
6 months ago
0
fix issue #220
#242
a3213105
closed
6 months ago
0
Bump gradio from 4.11.0 to 4.19.2 in /examples/web_demo
#241
dependabot[bot]
closed
6 months ago
0
[Fix] Fix the wrong output of QWEN-14B.
#240
marvin-Yu
closed
6 months ago
0
[BUG] PR #224 cause the wrong output of QWEN-14B
#239
a3213105
closed
6 months ago
0
[Example] Add seq_length in qwen fake config.ini
#238
Duyi-Wang
closed
6 months ago
0
[CMake] Remove force reinstall for mkl dependencies.
#237
Duyi-Wang
closed
6 months ago
0
[Kernel] Add oneDNN GPU kernels.
#236
changqi1
closed
3 months ago
0
BF16_INT4 model loading too slow
#235
intelyoungway
closed
3 months ago
1
[CMake] Open the pip-install information for MKL.
#234
marvin-Yu
closed
6 months ago
0
torch==2.2.0 run error
#233
Zjq9409
closed
6 months ago
3
[Fix] Add parameter check for logN and NTK rotary embedding of QWEN
#232
a3213105
closed
6 months ago
0
[Env] Add XFT_ENGINE env variable.
#231
changqi1
closed
6 months ago
0
[Fix] Add parameter check for logN and NTK rotary embedding of QWEN
#230
a3213105
closed
6 months ago
0
Qwen Segmentation Fault after logN PR merged.
#229
Duyi-Wang
closed
6 months ago
2
[kernel] Add ICX compiler.
#228
changqi1
closed
6 months ago
1
[Dependencies] Remove tokenizers requirement.
#227
Duyi-Wang
closed
6 months ago
0
[models][layers/tools] Refine and bugfix for baichuan models
#226
abenmao
closed
6 months ago
0
[Layer] Convert static MMHelper class to instance Class in DecoderContext.
#225
changqi1
closed
6 months ago
0
[Tools] Accelerate model loading.
#224
marvin-Yu
closed
6 months ago
2