charent/ChatLM-mini-Chinese
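A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). All code for the full pipeline is open source: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.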
Apache License 2.0
1.12k stars
132 forks
issues
Tokenizer training runs out of memory (OOM) with 60 GB of RAM
#59
musexiaoluo
opened
1 week ago
0
Data cleaning code
#58
Mrkkew
opened
1 week ago
0
Could you share the cleaned dataset? My loss is stuck at 4.0 and won't go down
#57
KTVICTORY18
opened
1 month ago
0
Running python ptr_train.py in step 3.4 raises OSError: Can't load tokenizer for 'D:/pycharmenv/ChatLM-mini-Chinese/model_save/'.
#56
summerFF
opened
1 month ago
1
Step 3.4 pre-training fails with an unsupported operand type(s) error; help wanted
#55
summerFF
opened
1 month ago
0
4080 GPU can barely handle any data; training errors out once the data exceeds 10,000 samples
#54
iissy
opened
1 month ago
4
Many tokens in the tokenizer vocabulary contain underscores; what do they mean?
#53
Mactarvish
closed
1 month ago
2
Can this be trained on AMD GPUs?
#52
alexhan1012
closed
1 month ago
1
Pre-training on 1.6 million samples (2 GB of sentence pairs) with A40 GPUs (48 GB VRAM) always hits OOM, whether using 1, 2, 3, or 4 cards
#51
JaymzWang
closed
1 month ago
1
This only works with question-answer pairs; is there a way to learn a body of knowledge with an MLM-style objective?
#50
BShark-YB
closed
1 month ago
1
Would you consider also uploading the pre-trained model and the SFT-only model to the platform?
#49
seal-wang
closed
1 month ago
1
sft_train
#48
dbcSep03
closed
1 month ago
1
Some NCCL operations have failed or timed out.
#47
dbcSep03
opened
4 months ago
5
Must the pre-training dataset be in the {"prompt": "response":} format?
#46
dbcSep03
closed
4 months ago
2
A very nice open-source project
#45
DataXujing
closed
4 months ago
1
How many tokens do these pre-training datasets add up to?
#44
StarCycle
closed
4 months ago
2
The model doesn't seem capable of long-text dialogue; how can it be trained to gain this ability?
#43
Liuxinhao12
closed
4 months ago
1
train_3.5M_CN data processing problem
#42
wflying000
closed
1 month ago
1
How do I load the model after SFT?
#41
Liuxinhao12
closed
4 months ago
1
RuntimeError: No executable batch size found, reached zero
#40
suiyueyousan
closed
5 months ago
2
Would you consider releasing a version that supports LLaMA?
#39
leondada
closed
5 months ago
1
How do I extract intermediate-layer outputs?
#38
W-void
closed
5 months ago
2
Error during SFT fine-tuning
#37
ama0zarashi
closed
5 months ago
4
Shape mismatch when using train.py
#36
huluk98
closed
4 months ago
10
Why are the predicted triples incorrect after fine-tuning?
#35
qiutzh
closed
5 months ago
5
Pre-training dataset
#34
rabintang
closed
5 months ago
2
How can the project be debugged with FastChat?
#33
zhilangtaosha
closed
6 months ago
1
Great Work! Does it support multimodal ability?
#32
LianghuiGuo
closed
5 months ago
1
Running pre_train raises TypeError: Accelerator.__init__() got an unexpected keyword argument 'use_seedable_sampler'
#31
JaymzWang
closed
5 months ago
1
Where can bell_open_source/train_0.8M_CN.json, used in data preprocessing, be downloaded?
#30
PshySimon
closed
6 months ago
7
If new content needs to be added, does everything have to be retrained from scratch?
#29
kideve
closed
5 months ago
2
Bump fastapi from 0.105.0 to 0.109.1
#28
dependabot[bot]
closed
7 months ago
0
With multiple GPUs, is the same dataset loaded multiple times?
#27
shinerdeng
closed
7 months ago
6
For Chinese-only RAG, which works better: this model or your other phi project?
#26
xianzhisheng
closed
5 months ago
1
How do I run "3.3 Tokenizer training"?
#25
ybdesire
closed
7 months ago
2
Why do I get stuck loading the dataset after running it
#24
anyiz
closed
5 months ago
11
Error partway through SFT fine-tuning
#23
aoguai
closed
7 months ago
11
Have you considered distributing the model on https://modelscope.cn/?
#22
qmjy
closed
7 months ago
2
Training with LoRA and sft_train.py seems to have no effect; is there a better approach?
#21
yugu91
closed
5 months ago
7
Could the README provide a Docker image that bundles the environment and the model?
#20
zack-sys
closed
7 months ago
1
Are there plans to fine-tune for agent function calling?
#19
lucasjinreal
closed
1 month ago
4
Would training on better hardware make a big difference in results?
#18
aiwillcoming
closed
7 months ago
1
Question: the generated replies are repetitive
#17
shinerdeng
closed
7 months ago
2
Why does this appear when I start the API?
#16
Adolph3671
closed
7 months ago
1
Hello, first-time user here: what causes unsupported operand type(s) for |: 'types.GenericAlias' and 'type' at runtime?
#15
yugu91
closed
7 months ago
2
Can this run on a server?
#14
yanyilin3344
closed
7 months ago
5
Error when running SFT on the provided model
#13
cq1316
closed
7 months ago
13
Will the cleaned dataset be open-sourced?
#12
echo-valor
closed
7 months ago
1
Dev
#11
charent
closed
8 months ago
0
How do I run it?
#10
meng25meng
closed
7 months ago
17