intel xFasterTransformer issues

Apache License 2.0

348 stars 60 forks

[model] Add deepseek model.

#274 marvin-Yu closed 5 months ago
0
Issue qwen72b seq length

#273 a3213105 closed 5 months ago
1
[Common] Unify memory allocation into xft::alloc

#272 pujiang2018 closed 5 months ago
0
[Include] Fix include not work.

#271 Duyi-Wang closed 6 months ago
0
[include] Organize include file

#270 wenhuanh closed 6 months ago
0
4 sockets qwen execute question

#269 Storm0921 closed 5 months ago
13
[Kernel] increase parallelism for KV cache copy in self attention

#268 pujiang2018 closed 6 months ago
0
[Kernel] Fix the incorrect computing which should be in float, but was in integer

#267 pujiang2018 closed 6 months ago
0
[Layer] Reduce repeated sin and cos embedding table data to optimize ROPE perf.

#266 changqi1 closed 6 months ago
0
[Layer] Use flash attention when larger than threshold ('>=' to '>')

#265 pujiang2018 closed 6 months ago
0
gpu kernel: rms_norm

#264 aurora327 closed 6 months ago
0
[Benchmark] Modify CPU affinity logic, add CI prompt output.

#263 marvin-Yu closed 6 months ago
0
[Version] v1.4.0.

#262 Duyi-Wang closed 6 months ago
0
[Benchmark] Fix typo in benchmark script.

#261 Duyi-Wang closed 6 months ago
0
[Search] Sync sample result in multi-rank.

#260 Duyi-Wang closed 6 months ago
0
[Model] Add gemma model support.

#259 marvin-Yu closed 5 months ago
3
[Attention Kernel/Layer] group attention support in full-link BF16 path; attention layer refactor

#258 pujiang2018 closed 6 months ago
0
[Benchmark] Update model cfg for transformers>4.36.

#257 Duyi-Wang closed 6 months ago
0
typo in benchmark script

#255 sssssux closed 6 months ago
1
[Serving] Fix fail to set pad_token_id when it's not None in single mode.

#254 Duyi-Wang closed 6 months ago
0
[Kernel] Add oneDNN GPU kernels.

#253 changqi1 closed 3 months ago
0
[layers] Add bf16-type input/output support for flash attention

#252 abenmao closed 6 months ago
0
Fix Opt issue

#251 xiangzez closed 6 months ago
0
support vllm?

#250 leiwen83 closed 3 months ago
2
[Fix] Fall back to float to bypass issues with MQA/GQA.

#249 marvin-Yu closed 6 months ago
1
[Docs] Initial documents.

#248 Duyi-Wang closed 6 months ago
0
Illegal instruction (core dumped)

#247 wswsmao closed 6 months ago
5
[Dependency] Update web demo requirement.

#246 Duyi-Wang closed 6 months ago
0
[Kernel] Set USE_AMX_M to 1.

#245 Duyi-Wang closed 6 months ago
0
Error install xfastertransformer on CentOS 7.6

#244 bin1guo closed 6 months ago
1
[Example] Add llama2 chat support in Cli demo.

#243 Duyi-Wang closed 6 months ago
0
fix issue #220

#242 a3213105 closed 6 months ago
0
Bump gradio from 4.11.0 to 4.19.2 in /examples/web_demo

#241 dependabot[bot] closed 6 months ago
0
[Fix] Fix the wrong output of QWEN-14B.

#240 marvin-Yu closed 6 months ago
0
[BUG] PR #224 cause the wrong output of QWEN-14B

#239 a3213105 closed 6 months ago
0
[Example] Add seq_length in qwen fake config.ini

#238 Duyi-Wang closed 6 months ago
0
[CMake] Remove force reinstall for mkl dependencies.

#237 Duyi-Wang closed 6 months ago
0
[Kernel] Add oneDNN GPU kernels.

#236 changqi1 closed 3 months ago
0
BF16_INT4 model loading too slow

#235 intelyoungway closed 3 months ago
1
[CMake] Open the pip-install information for MKL.

#234 marvin-Yu closed 6 months ago
0
torch==2.2.0 run error

#233 Zjq9409 closed 6 months ago
3
[Fix] Add parameter check for logN and NTK rotary embedding of QWEN

#232 a3213105 closed 6 months ago
0
[Env] Add XFT_ENGINE env variable.

#231 changqi1 closed 6 months ago
0
[Fix] Add parameter check for logN and NTK rotary embedding of QWEN

#230 a3213105 closed 6 months ago
0
Qwen Segmentation Fault after logN PR merged.

#229 Duyi-Wang closed 6 months ago
2
[kernel] Add ICX compiler.

#228 changqi1 closed 6 months ago
1
[Dependencies] Remove tokenizers requirement.

#227 Duyi-Wang closed 6 months ago
0
[models][layers/tools] Refine and bugfix for baichuan models

#226 abenmao closed 6 months ago
0
[Layer] Convert static MMHelper class to instance Class in DecoderContext.

#225 changqi1 closed 6 months ago
0
[Tools] Accelerate model loading.

#224 marvin-Yu closed 6 months ago
2
