jingyaogong
/
minimind
[Large Model] Train a small 26M-parameter GPT completely from scratch in 3 hours; inference and training run on a personal GPU!
https://jingyaogong.github.io/minimind
Apache License 2.0
2.7k
stars
329
forks
issues
Update 5-dpo_train.py
#90
leoz9
opened
2 days ago
0
Is the tokenizer the problem? Should I switch to a different one?
#89
Enter10000
closed
5 days ago
1
Why does LoRA take longer than full fine-tuning? Is this normal?
#88
srconly
closed
5 days ago
1
A basic question: are the different training methods (PT, SFT, DPO) essentially the same?
#87
chuanzhubin
closed
5 days ago
1
Fixed the chat_template logic in tokenizer training and the corresponding part of tokenizer_config.json
#86
Singularity-M
opened
1 week ago
0
5-dpo_train fails at runtime; is it a version issue?
#85
srconly
closed
1 week ago
2
How can memory usage be optimized?
#84
ltc0
closed
1 week ago
2
Do the existing models support English?
#83
chongkuiqi
closed
1 week ago
4
Question about resuming training after pretraining is interrupted
#82
VictorSun1996
closed
1 week ago
2
Question about the padding mask
#81
skygreygrey
closed
5 days ago
4
Help: is the SFT data not encoded with the tokenizer after processing?
#80
Enter10000
closed
1 week ago
2
Does pretrain_data(24-09-27后弃用).bin (labeled "deprecated after 24-09-27") still need to be downloaded? Thanks
#79
chenzk1993
closed
2 weeks ago
1
Not really an issue: a brief discussion of pretrain eval results
#78
cpp2016
closed
2 weeks ago
1
Update requirements.txt
#77
LIE624
opened
2 weeks ago
0
Can't even answer 3 to the power of 3?
#76
bozaigao
closed
2 weeks ago
1
Study notes on Minimind's inference process
#75
RyanSunn
opened
3 weeks ago
2
Error when running python data_process.py
#74
frozencoolcool
closed
2 weeks ago
1
Wrote a tutorial while studying the code; hope it helps other learners
#73
pengqianhan
opened
3 weeks ago
1
Error when generating sft_data.csv
#72
liyu98
closed
2 weeks ago
2
Question about the Transformer architecture
#71
RyanSunn
closed
2 weeks ago
2
English support
#70
deweihu96
closed
2 weeks ago
2
Model input and output length
#69
cqcracked
closed
4 weeks ago
3
Problem reproducing the results
#68
lesterlee89
closed
2 weeks ago
10
Out of GPU memory at 768
#67
yhl41001
closed
4 weeks ago
4
How do I fine-tune models of different sizes?
#66
luckyfan-cs
closed
4 weeks ago
3
DPO data processing produces empty output
#65
lesterlee89
closed
1 month ago
2
Integrating gradient low-rank projection
#64
ningpengtao-coder
closed
1 month ago
2
fix 5-dpo_train.py bugs
#63
StudyingLover
closed
1 month ago
3
Can this project continue pretraining?
#62
win10ogod
closed
4 weeks ago
3
Reason for setting hidden_dim = 4 * dim
#61
pengqianhan
closed
4 weeks ago
1
Discussion: training time on personal GPUs
#60
jingyaogong
opened
1 month ago
4
Auto tokenizer name path fix
#59
krmst
opened
1 month ago
0
Add quantization as well and the whole lifecycle would be complete
#58
cqcracked
closed
1 month ago
1
Problem with 5-dpo_train.py
#57
StudyingLover
closed
1 month ago
1
How to fine-tune for downstream tasks?
#56
h2h2h
closed
4 weeks ago
4
Some findings from running on a Mac (not an issue)
#55
krmst
opened
1 month ago
7
Update requirements
#54
krmst
opened
1 month ago
0
Problem with 5-dpo_train.py
#53
cqcracked
closed
1 month ago
1
Setting up DDP via script fails
#52
FangKQ
closed
1 week ago
2
Newly added loss mask in 1-pretrain does not match the existing F.cross_entropy arguments
#51
lesterlee89
closed
1 month ago
1
Notebook for training and testing with Google Colab (not an issue)
#50
xxx1099836595
opened
1 month ago
1
Is multi-node, multi-GPU cluster training supported?
#49
jerry1993-tech
closed
4 weeks ago
2
CUDA_HOME does not exist, unable to compile CUDA op(s)
#48
ozbillwang
closed
4 weeks ago
12
Are MoE models harder to train?
#47
WangRongsheng
closed
1 month ago
4
Email reply
#46
iomgaa-ycz
closed
1 month ago
2
Training still runs normally when dtype in the original 1-pretrain.py is changed to b or bfloat1666666
#45
cqcracked
closed
1 month ago
4
Fixed wandb bug & added argparse
#44
iomgaa-ycz
closed
1 month ago
3
Added wandb
#43
iomgaa-ycz
closed
1 month ago
0
Fixed a bug in data_process.py
#42
iomgaa-ycz
closed
1 month ago
0
How is Seq-Monkey's 10B total calculated?
#41
CarryHJR
closed
1 month ago
2