intel xFasterTransformer issues

intel / xFasterTransformer

Apache License 2.0

270 stars 53 forks

issues


Add --padding and fix bug

#418 yangkunx closed 1 month ago
0
[Kernel] Add oneDNN AMX_FP16 compute kernels.

#417 changqi1 closed 1 month ago
1
[Dependency] Update torch to 2.3.0.

#416 Duyi-Wang closed 1 month ago
0
[Kernel] Add FP16 MHA and MLP kernels.

#415 changqi1 closed 1 month ago
0
Can we summarize the meanings of data types like bf16_fp16? For example, what are the activation and output data types, and which compute instruction is used?

#414 heagoo opened 1 month ago
1
[output issue] found mistakes in llama-3-70b output by bf16_int4 during benchmark

#413 intelyoungway opened 1 month ago
1
[Kernel] Add FP16 LLaMA YARN rotary_embedding.

#412 changqi1 closed 1 month ago
1
qwen1.5-32b long text input issue

#411 zhm-algo closed 2 weeks ago
5
[xDNN] Release v1.5.0.

#410 changqi1 closed 1 month ago
0
[Benchmark] Add platform options. Support real model.

#409 JunxiChhen closed 1 month ago
1
[Kernel] Add FP16 rmsnorm and rope kernels.

#408 changqi1 closed 1 month ago
0
[Models/Layers/Kernels] Add Baichuan1/2 full-link bf16 support & Fix next-tok gen bug

#407 abenmao closed 1 month ago
0
[Example] Fix incorrect tensor dimension with latest interface

#406 pujiang2018 closed 1 month ago
0
[Bug] fix incorrect input offset computing

#405 pujiang2018 closed 1 month ago
0
[Interface] Support List[int] and List[List[int]] for set_input_sb.

#404 Duyi-Wang closed 1 month ago
0
llama-2-70b memory usage

#403 zhm-algo closed 1 month ago
2
[Layers] Add alibiSlopes Attn && Flash Attn for CB.

#402 abenmao closed 1 month ago
0
[Example] Add demo of offline continuous batching

#401 pujiang2018 closed 1 month ago
0
[Interface] Change return shape of forward_cb.

#400 Duyi-Wang closed 1 month ago
0
[KVCache] Remove FP32 data type.

#399 Duyi-Wang closed 1 month ago
0
[Interface] Add python api for continuous batching.

#398 Duyi-Wang closed 1 month ago
0
[Layer] Better method to reinterpret KV cache

#397 pujiang2018 closed 1 month ago
0
[API] Optimize API Impl.

#396 changqi1 closed 1 month ago
0
What's the meaning of bf16_int4 datatype?

#395 LeiZhou-97 closed 1 month ago
4
[Model] Check maxLen should be [input len, model max len].

#394 Duyi-Wang closed 1 month ago
0
[Example] More check in C++ continuous batching example

#393 pujiang2018 closed 1 month ago
0
[Example] Fix continuous batching C++ example.

#392 Duyi-Wang closed 1 month ago
0
[Bug] Fix incorrect buffer size calculation

#391 pujiang2018 closed 1 month ago
0
[Example] add cb_check example

#390 pujiang2018 closed 1 month ago
0
[Model][Layer] Correct output of the new forward

#389 pujiang2018 closed 1 month ago
0
[Build] Fix namespace build issue.

#388 Duyi-Wang closed 1 month ago
0
[Common] DecoderContext::resize bug fix

#387 pujiang2018 closed 1 month ago
0
[API] Add LLaMA decoder API.

#386 changqi1 closed 1 month ago
3
[Demo] Add abbreviation for output length.

#385 Duyi-Wang closed 1 month ago
0
[Layer] update mlp for CB.

#384 marvin-Yu closed 1 month ago
0
[Layers] Added RotaryEmbedding forward for bc mode & Fixed rope ut

#383 abenmao closed 1 month ago
0
[Layer] Cross attention impl. for CB

#382 pujiang2018 closed 1 month ago
0
[Framework] Update set_input for cb.

#381 Duyi-Wang closed 1 month ago
0
llama-2-7B benchmarking error with Chinese prompts

#380 qdym188 closed 1 month ago
1
[Framework] Code fix to make new path for CB work

#379 pujiang2018 closed 2 months ago
0
[API] Add LLaMA attention API.

#378 changqi1 closed 1 month ago
1
[Model] Return seqIDs when set input.

#377 Duyi-Wang closed 2 months ago
0
[Sampling] Add greedy search for cb path.

#376 Duyi-Wang closed 2 months ago
0
[Model/Layer] New forward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP)

#375 pujiang2018 closed 2 months ago
1
[CMake] Remove evaluation under XFT_BUILD_TESTS option.

#374 Duyi-Wang closed 2 months ago
0
[Sampling] Add repetition penalty for new seq type.

#373 Duyi-Wang closed 2 months ago
0
[Kernel] Add GPU kernels and enable LLaMA model.

#372 changqi1 closed 3 weeks ago
0
[Common] New KVCacheMgr to support CB

#371 pujiang2018 closed 2 months ago
1
[Model] Fix ICX build issue.

#370 changqi1 closed 2 months ago
0
[Model] New CommonDecoder::forward impl. skeleton

#369 pujiang2018 closed 2 months ago
2
