Issues · DLLXW / baby-llama2-chinese
A repository for pretraining a small-parameter Chinese LLaMa2 from scratch plus SFT; a single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese Q&A ability.
MIT License · 2.44k stars · 300 forks
Sorted by: Newest
#84 · Which code generates the tokenizer-processed pretraining corpus? · livevivaer · opened 5 days ago · 0 comments
#83 · Why is data saved in Parquet format during cleaning, but JSON still used later for tokenization? · yangwenche · opened 2 weeks ago · 0 comments
#82 · The ChatGLMTokenizer class · licx102359 · opened 2 weeks ago · 2 comments
#81 · Doesn't the final slice of the pretraining input leave the model input one token short? · AI-Study-Han · opened 1 month ago · 0 comments
#80 · The model's answers are long and the output is truncated; how can this be fixed? · MSJeinlong · opened 2 months ago · 0 comments
#79 · smallvocab tokenizer · iangellove · opened 2 months ago · 0 comments
#78 · Are there open-source projects to reference for reinforcement learning of language models? · AI-Study-Han · opened 2 months ago · 1 comment
#77 · How can large volumes of data be loaded? · CaesarGo · opened 3 months ago · 0 comments
#76 · At which step are positional embeddings added? · buhe · closed 3 months ago · 1 comment
#75 · Which package contains the chatglm_tokenizer module? · PANASV · opened 3 months ago · 2 comments
#74 · In the pretraining stage, each training sample mixes different sentences (separated by <eos>) · Itochiee · opened 4 months ago · 0 comments
#73 · Why limit the text length when processing the fine-tuning dataset? · jzzzf · opened 4 months ago · 1 comment
#72 · Does this project support resuming training from a checkpoint? · 1737686924 · opened 4 months ago · 2 comments
#71 · Is deployment with TensorRT-LLM supported? · Ss-shuang123 · opened 4 months ago · 0 comments
#70 · Turning in my homework (sharing results) · yasohasakii · closed 4 months ago · 0 comments
#69 · Process a single file per loop iteration to avoid OOM · maoxiangyi · opened 4 months ago · 0 comments
#68 · Dimension mismatch between pretrained model parameters and eval parameters · 1019245175 · opened 4 months ago · 0 comments
#67 · The c4-zh data has problems · yasohasakii · closed 4 months ago · 3 comments
#66 · How to resume training after the machine loses power mid-run · GromZhang · opened 5 months ago · 2 comments
#65 · fix: Fix attribute error and reduce memory usage during data processing · noahc1510 · opened 5 months ago · 0 comments
#64 · Can training run on a single 4060 Ti with 16 GB of VRAM? · XiaoluJiayou · closed 5 months ago · 1 comment
#63 · Problem with tokenizer? · shokhjakhonone · opened 6 months ago · 3 comments
#62 · Is this error caused by a misconfiguration? · beginner-wj · opened 6 months ago · 0 comments
#61 · What does this error message mean? · beginner-wj · closed 6 months ago · 0 comments
#60 · To enrich and extend this project, open-sourcing DeepSpeed training code and weights (1.75B) · AI-Study-Han · closed 5 months ago · 0 comments
#59 · Ignore the `freqs_cis` buffer so that DDP does not broadcast it at construction time · xiaoguzai · opened 7 months ago · 0 comments
#58 · Error when running training · singeleaf · closed 5 months ago · 3 comments
#57 · For personal use · life-peace · closed 7 months ago · 0 comments
#56 · fix: multi-GPU DDP save error · billvsme · closed 8 months ago · 0 comments
#55 · In the optimizer configuration, why do parameters with 2 or more dimensions get weight decay while those with fewer do not? · zerozhoujie · closed 7 months ago · 2 comments
#54 · /track1/train_valid.json · cj401 · closed 7 months ago · 1 comment
#53 · How to modify the code to support 4k and 16k context lengths? · 937739823 · closed 7 months ago · 1 comment
#52 · Turning in my homework (sharing results) · ljg-lixufeng · closed 7 months ago · 0 comments
#51 · Tip: if compile = True is added during training, it must also be used during SFT, otherwise model loading fails · Hong-Shuo · closed 7 months ago · 0 comments
#50 · Attention!! A critical typo in the inference code is why everyone sees poor results. Please take note! · DLLXW · closed 5 months ago · 0 comments
#49 · Multi-node, multi-GPU pretraining · lixin716 · closed 7 months ago · 2 comments
#48 · After changing the vocabulary size, dimensions no longer match the pretrained model; how do people handle this? · ghost · opened 9 months ago · 0 comments
#47 · Model performance · AI-Study-Han · closed 5 months ago · 3 comments
#46 · This file was not found · servlet1111 · closed 7 months ago · 0 comments
#45 · The latest version of transformers raises an error · somewordstoolate · opened 9 months ago · 2 comments
#44 · Calculating the model parameter count · zxx20231119 · opened 10 months ago · 2 comments
#43 · Differences in the early data-processing steps · wujianqiangwjq · closed 10 months ago · 1 comment
#42 · A very strange error: IndexError: index 35930 is out of bounds for axis 1 with size 2048 · zhaodice · closed 10 months ago · 1 comment
#41 · First epoch finished, now running the second; could a conversion script be added so oobabooga can use the model? · limao999666 · opened 11 months ago · 0 comments
#40 · Why is no attention mask needed during pretraining? · LLH1818 · closed 11 months ago · 0 comments
#39 · Asking about the training data and the number of epochs · YuzhouPeng · opened 11 months ago · 4 comments
#38 · The eos token is an empty string · Destiny-Lu · closed 11 months ago · 4 comments
#37 · The following exception occurs during multi-GPU pretraining · GromZhang · opened 11 months ago · 7 comments
#36 · If this is pretrained from scratch, why does llama2 appear in the project? As a beginner I don't quite understand · GromZhang · closed 11 months ago · 1 comment
#35 · Summarizing a few issues · Vincent-ZHQ · opened 12 months ago · 3 comments