baichuan-inc / Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0 · 4.08k stars · 293 forks
Issues (newest first)
What is the minimum configuration required for full-parameter fine-tuning?
#320 · richey07 · closed 9 months ago · 1 comment
How is tabular data handled during Baichuan pre-training?
#319 · sunshineflg · closed 9 months ago · 1 comment
Commercial-use license
#318 · DSXiangLi · closed 9 months ago · 2 comments
Problem saving the model after online quantization
#317 · MurraryZhao · opened 9 months ago · 1 comment
How is the compression ratio calculated?
#316 · sunshineflg · opened 9 months ago · 0 comments
RuntimeError: probability tensor contains either inf, nan or element < 0
#315 · HiXiaochen · opened 9 months ago · 2 comments
Fine-tuning reports an error
#314 · Dmm2584v · opened 9 months ago · 1 comment
Loss is always 0
#313 · whk6688 · opened 9 months ago · 9 comments
Why doesn't Baichuan2 apply dropout in attention?
#312 · David-Lee-1990 · closed 9 months ago · 1 comment
After 8-bit quantization, does inference dequantize back to FP16 for computation?
#311 · huanyingjun · opened 9 months ago · 1 comment
How to modify a token that already exists in the tokenizer
#310 · muziyongshixin · opened 9 months ago · 0 comments
sdp_kernel() got an unexpected keyword argument 'enable_mem_efficient'
#309 · zoe-yyx · closed 9 months ago · 1 comment
Baichuan supports a 4096-token context, so why is self.max_seq_len_cached = 4096? For length extrapolation, seq_len must exceed 4096, but after removing Baichuan's truncation logic and testing with seq_len > 4096, the model either repeats itself or produces no output.
#308 · IT-five · opened 9 months ago · 0 comments
Xformers installation keeps failing
#307 · IT-five · closed 9 months ago · 3 comments
Is there an incremental pre-training interface for training directly on domain-specific text?
#306 · childlong · opened 10 months ago · 1 comment
Update README.md
#305 · Tyx-main · closed 8 months ago · 1 comment
After NTK interpolation on baichuan2-13b, generate is extremely slow for long-text inference
#304 · LOTK2019 · opened 10 months ago · 3 comments
GPU memory spikes during Baichuan2-13B-chat-4bit inference; looking for a solution
#303 · bultiful · opened 10 months ago · 12 comments
model.generate_config loads inconsistently when device_map is "auto" vs "cuda:0"
#302 · DecideToLeave · opened 10 months ago · 1 comment
How much GPU memory does fine-tuning need? fine-tune.py runs out of memory on an A6000 (48 GB)
#301 · nevesaynever1 · closed 10 months ago · 0 comments
Can max_new_tokens be larger than max_seq_len?
#300 · cxjtju · opened 10 months ago · 0 comments
cli_demo gives poor multi-turn results on 7b-chat
#299 · xealml · closed 10 months ago · 0 comments
Was the 2.6 trillion-token high-quality corpus trained for just 1 epoch?
#298 · ZayIsAllYouNeed · closed 9 months ago · 1 comment
With a 13B chat model loaded, what is the difference between chat-mode and generate-mode inference, and why do the answers differ?
#297 · cgt-woailol · opened 10 months ago · 1 comment
How can the base model do multilingual translation? Is there a demo?
#296 · victorxst · opened 10 months ago · 1 comment
Implementation of the ALiBi mask for Baichuan2-13B-Chat
#295 · bugm · opened 10 months ago · 0 comments
Is there documentation for the model-inference API: its functions, parameters, and what each parameter means?
#294 · starevelyn · opened 10 months ago · 2 comments
How to enforce a fixed output format, such as JSON
#293 · mine114 · opened 10 months ago · 1 comment
Inference error on a 3090: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
#292 · liwenju0 · opened 10 months ago · 0 comments
Fixes common problems from existing issues, including per-sample inference inconsistency, NaN scores, and half-precision prediction inconsistency
#291 · qiu404 · opened 10 months ago · 6 comments
Can the 7B model be deployed on a CPU with 32 GB of RAM?
#290 · ghkl98 · closed 9 months ago · 1 comment
Is this caused by a network problem?
#289 · ghkl98 · closed 9 months ago · 1 comment
How to disable xformers?
#288 · KegangWangCCNU · opened 10 months ago · 2 comments
Learning rate never changes
#287 · juemifuji · closed 9 months ago · 4 comments
Question: can chat history be cleared between calls in a loop?
#286 · gggdroa · opened 10 months ago · 5 comments
Could an int4-quantized version of the base model be provided?
#285 · ethanyxfang · closed 9 months ago · 1 comment
Need to import the model weight init function to run quantize
#284 · seacdr · closed 10 months ago · 3 comments
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
#283 · Guodongchang · opened 10 months ago · 2 comments
[SOS] Running cli_demo.py raises "A matching Triton is not available" and I'm stuck here
#282 · helloworld-zhangqiang · opened 10 months ago · 5 comments
Create OpenAI_api.py
#281 · yuunnn-w · closed 8 months ago · 1 comment
Update requirements.txt
#280 · yuunnn-w · closed 8 months ago · 1 comment
Why are generation results poor when the model is loaded with vllm?
#279 · xxSpencer · opened 10 months ago · 0 comments
Question about Baichuan model performance
#278 · zhaoyuguang · opened 10 months ago · 1 comment
Output formatting is poor, with no line breaks between paragraphs; is there a way to improve it?
#277 · 15810644043 · opened 10 months ago · 1 comment
HELP!!! 13B-chat inference is extremely slow on two V100s
#276 · garyyang85 · closed 9 months ago · 6 comments
AttributeError: 'list' object has no attribute 'as_dict'
#275 · hongxiuzhe · opened 10 months ago · 4 comments
Batch inference and per-sample inference give inconsistent results
#274 · zchuz · closed 9 months ago · 3 comments
Baichuan2 13B int8: concurrent requests in streaming mode raise RuntimeError: cannot be multiplied (1xl and 2x15360)
#273 · ycwdyy · opened 10 months ago · 2 comments
Baichuan2-13B-chat: int8 inference on an A800 is much slower than FP16
#272 · Roysky · opened 10 months ago · 0 comments
[HELP!!!] Loss value is 0
#271 · li-haojun · opened 10 months ago · 3 comments