ztxz16 / fastllm
Pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices
Apache License 2.0 · 3.2k stars · 322 forks
Issues
Support explicitly specifying the CUDA compute capability in CMake
#475 · fluxlinkage · opened 21 hours ago · 1 comment
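For #475, a minimal sketch of passing an explicit architecture list to CMake from a build script. CMAKE_CUDA_ARCHITECTURES is standard CMake; the USE_CUDA switch and the architecture values are assumptions, not confirmed fastllm options.

# Hypothetical build driver: request specific CUDA compute capabilities.
import subprocess

subprocess.run(
    [
        "cmake", "..",
        "-DUSE_CUDA=ON",                        # assumed fastllm CUDA switch
        "-DCMAKE_CUDA_ARCHITECTURES=61;70;75",  # e.g. Pascal, Volta, Turing
    ],
    check=True,
)
subprocess.run(["make", "-j"], check=True)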
Support saving models loaded directly from safetensors in the flm format, and loading them for inference
#474 · TylunasLi · closed 1 day ago · 1 comment
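For #474, a sketch of the flm round trip, based on the fastllm_pytools calls shown in the project README (llm.from_hf, save, llm.model, response); exact signatures may vary by version, and the model path is a placeholder.

# Convert an HF model held in memory to fastllm, save it as flm, then reload it.
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

tokenizer = AutoTokenizer.from_pretrained("model_dir", trust_remote_code=True)
hf_model = AutoModelForCausalLM.from_pretrained("model_dir", trust_remote_code=True)

model = llm.from_hf(hf_model, tokenizer, dtype="float16")  # in-memory conversion
model.save("model.flm")                                    # persist in flm format
model = llm.model("model.flm")                             # reload for inference
print(model.response("你好"))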
Add fp16 input support for int4g models
#473 · jiewlmrh · closed 5 days ago · 0 comments
Add fp16 input support for int8 models
#472 · jiewlmrh · closed 1 week ago · 0 comments
OSError: libcublas.so.11: cannot open shared object file: No such file or directory
#471 · lichengyang666 · opened 1 week ago · 1 comment
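For #471, one way to diagnose the error outside fastllm is to try loading the library directly; the CUDA path below is only a common default.

# Check whether the dynamic linker can find libcublas before importing fastllm.
import ctypes

try:
    ctypes.CDLL("libcublas.so.11")
except OSError:
    print("libcublas.so.11 not found; add the CUDA library directory to "
          "LD_LIBRARY_PATH before launching Python, e.g.\n"
          "  export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH")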
Meta-Llama-3-70B-Instruct
#470 · longcheng183 · opened 1 week ago · 5 comments
Directly load Llama3 and Qwen2 HF models; use ChatTemplate in apiserver, webui, and benchmark
#469 · TylunasLi · closed 2 weeks ago · 0 comments
Fix the "restrict is not allowed" error caused by __restrict__
#468 · ColorfulDick · closed 3 weeks ago · 1 comment
Support converting the glm4-9b-chat model
#467 · TylunasLi · closed 3 weeks ago · 0 comments
Fix the Windows build of webui/apiserver, and support loading HF models directly
#466 · TylunasLi · closed 3 weeks ago · 0 comments
GLM-4-6B-Chat fails to load after conversion to flm format
#465 · HofNature · closed 1 week ago · 5 comments
Fix compilation on Windows
#464 · TylunasLi · closed 4 weeks ago · 0 comments
Compilation error from half type conversion when building in Docker on H800
#463 · ShadowTeamCN · closed 1 month ago · 1 comment
When will GLM-4 be supported?
#462 · Stupid-Ai · closed 1 week ago · 4 comments
Decoding problem in qwen1.5 int4 model replies: UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 72-73: invalid continuation byte
#461 · zhang415 · opened 1 month ago · 0 comments
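A frequent cause of the error in #461 is decoding a streamed reply chunk by chunk, so that a multi-byte UTF-8 character is split across two chunks; an incremental decoder buffers the partial bytes instead of raising. A general sketch, not fastllm's own decoding path:

# Tolerant streaming decode: incomplete trailing bytes wait for the next chunk.
import codecs

decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

def decode_chunk(chunk: bytes) -> str:
    return decoder.decode(chunk)

data = "你好".encode("utf-8")  # 6 bytes, 3 per character
print(decode_chunk(data[:4]))  # prints 你; the stray fourth byte is buffered
print(decode_chunk(data[4:]))  # prints 好 once the character completes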
Added English Translation of Readme
#460 · Wheylop · closed 1 month ago · 0 comments
Error during make -j
#459 · AIlaowong · opened 1 month ago · 3 comments
Add an add_special_tokens option, defaulting to true, with chatglm support
#458 · levinxo · closed 1 month ago · 1 comment
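For context on #458, add_special_tokens mirrors the standard Hugging Face tokenizer argument of the same name; a generic illustration, with bert-base-uncased used only as an example model:

# The flag controls whether markers like [CLS]/[SEP] are added around the input.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tok("hello", add_special_tokens=True).input_ids)   # [101, 7592, 102]
print(tok("hello", add_special_tokens=False).input_ids)  # [7592]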
Is deepseekv2 quantization supported yet?
#457 · fw2325 · closed 1 month ago · 1 comment
Revert "Add an add_special_tokens option, defaulting to true, with chatglm model support"
#456 · ztxz16 · closed 1 month ago · 0 comments
Fix compilation errors on arm64 Windows
#455 · dignfei · closed 1 month ago · 0 comments
[CMakeFiles/Makefile2:100: CMakeFiles/pyfastllm.dir/all]
#454 · ttaop · opened 2 months ago · 0 comments
Provide an OpenAI-API-compatible HTTP server
#453 · MistSun-Chen · closed 2 months ago · 0 comments
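If the server requested in #453 follows the OpenAI chat completions API, client usage could look like the sketch below; the port, URL path, and model name are assumptions.

# Point the official openai client at a local OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="chatglm3-6b",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)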
Responses always come back as <unk>
#452 · VincentLore · opened 2 months ago · 1 comment
Add an add_special_tokens option, defaulting to true, with chatglm model support
#451 · levinxo · closed 1 month ago · 1 comment
chatglm3 generates identical results for the same prompt
#450 · ttaop · opened 2 months ago · 0 comments
Use vectorized memory access to optimize performance on older GPU architectures
#449 · TylunasLi · closed 2 months ago · 0 comments
Do you have a plan to implement the CudaCatOp?
#448 · dp-aixball · opened 2 months ago · 0 comments
Chinese input is not recognized; the address opened by webui is unreachable
#447 · Mihubaba · closed 2 months ago · 1 comment
Decoding errors with Qwen qwen1.5-14B-chat
#446 · yiguanxian · opened 2 months ago · 2 comments
Errors when running cmake -j
#445 · gggdroa · opened 2 months ago · 2 comments
Fix compilation for older NVIDIA GPU architectures, with initial performance optimizations for them
#444 · TylunasLi · closed 2 months ago · 0 comments
Unable to install fastllm_pytools
#443 · bailingchun · opened 2 months ago · 1 comment
Support grouped-query attention in Llama; support the InternLM2 (书生2) model
#442 · TylunasLi · closed 3 months ago · 0 comments
Add the Qt GUI "Qui" to the examples
#441 · jacques-chen · closed 3 months ago · 0 comments
Streaming output gets interrupted
#440 · lwinhong · opened 3 months ago · 0 comments
Fix several compilation issues on Windows
#439 · TylunasLi · closed 3 months ago · 0 comments
Fix Win32Demo CPU build errors
#438 · TylunasLi · closed 3 months ago · 1 comment
Is it true that already-quantized models cannot be used for model conversion?
#437 · shum-elli · opened 3 months ago · 1 comment
Is qwen1.5's sliding-window approach supported?
#436 · aofengdaxia · opened 3 months ago · 0 comments
Hi, is the performance better than chatglm.cpp's?
#435 · ericjing83 · opened 3 months ago · 0 comments
Does fastllm support chatglm3-6b-base int4 models quantized with bitsandbytes?
#434 · levinxo · opened 3 months ago · 0 comments
Error: cublas error during MatMul in Attention operator.
#433 · pingyuan2016 · closed 3 months ago · 3 comments
/api/chat_stream: the result returned by Postman is empty
#432 · Dong09 · opened 4 months ago · 0 comments
Add a Python Tensor-level API
#431 · wildkid1024 · closed 3 months ago · 0 comments
Fix Docker build error; update CMake version and base image
#430 · peter4431 · closed 3 months ago · 0 comments
ResponseBatch returns incorrect results
#429 · Liufeiran123 · opened 4 months ago · 5 comments
Fix Win32Demo project compilation and GPU execution issues for the MiniCPM model
#428 · TylunasLi · closed 4 months ago · 0 comments
Code for batch padding mask handling
#427 · Liufeiran123 · closed 4 months ago · 0 comments
Support exporting minicpm-2b-float16.flm
#426 · hadoop2xu · closed 4 months ago · 0 comments