yangjianxin1 / Firefly
Firefly: a training toolkit for large language models, supporting training of Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
5.89k stars · 526 forks
Issues
max_grad_norm not taking effect · #304 · yiyepiaoling0715 · opened 2 weeks ago · 1 comment
Integrate Liger Kernel: 20% multi-GPU speedup and 60% lower GPU memory usage, compatible with DeepSpeed · #303 · ByronHsu · opened 2 weeks ago · 0 comments
Added SFT, pretrain, and DPO configuration parameters for minicpm3 · #302 · LDLINGLINGLING · closed 1 month ago · 1 comment
qwen2 pretrain loss is extremely large · #301 · ucaslei · closed 4 weeks ago · 0 comments
Does calling unsloth currently only support a single GPU? · #300 · MinusOne2 · opened 2 months ago · 1 comment
About the parameter remove_unused_columns = false · #299 · Kenneth0901 · opened 2 months ago · 0 comments
Is the system_format field in the template required? For fine-tuning, can it be omitted to save input tokens? · #298 · Chris-Mraz · opened 2 months ago · 0 comments
Configured two GPUs, but the output still shows a single GPU. What is going on? · #297 · youzihaha · opened 3 months ago · 0 comments
OOM when loading the model on multiple GPUs · #296 · TonyUSTC · opened 3 months ago · 0 comments
Knowledge distillation · #295 · zjjznw123 · opened 3 months ago · 0 comments
Is the Linux platform supported? · #294 · Scorponok31 · closed 3 months ago · 0 comments
Loss is always 0 · #293 · chk4991 · opened 3 months ago · 1 comment
During inference with qwen1.5 the model keeps generating code uncontrollably · #292 · ExeCuteRunrunrun · opened 3 months ago · 0 comments
Are there plans to support Huawei Ascend NPUs? · #291 · bestRiven · opened 3 months ago · 0 comments
How to add a system prompt to the training corpus · #290 · louyuanyuan053709 · opened 4 months ago · 1 comment
llama 3.1 and DCLM support · #288 · WinterStraw · opened 4 months ago · 0 comments
Are there plans to support SFT for GLM4-9B-Chat? · #287 · Candy555 · opened 4 months ago · 0 comments
How is the category field used during inference? · #286 · louyuanyuan053709 · opened 4 months ago · 0 comments
Modify cache path rule for pretrain datasets to fix bug · #285 · ba5bo · opened 4 months ago · 0 comments
Are there plans to support InternVL? · #284 · moyans · opened 4 months ago · 0 comments
Why does running `python train.py --train_args_file train_args/sft/qlora/baichuan-7b-sft-qlora.json` report the following error? · #283 · lh5533223 · opened 4 months ago · 1 comment
About the distillation training code · #282 · rattlesnakey · opened 4 months ago · 4 comments
Training baichuan-13B with QLoRA, the displayed loss is empty; how to resolve this? · #281 · lovegit2021 · opened 4 months ago · 0 comments
Does unsloth currently not support the qwen2-moe architecture? · #280 · xiaoer2498 · opened 4 months ago · 0 comments
Fine-tuning Qwen2-1.5B-Instruct, loss is always 0 · #279 · frederichen01 · opened 4 months ago · 7 comments
Can DPO training mask out user turns like SFT and compute the loss only on assistant tokens? · #278 · charliedream1 · opened 4 months ago · 5 comments
Is single-machine multi-GPU inference supported for the fine-tuned model? · #277 · SunRise-Star · opened 4 months ago · 0 comments
Are there plans to support fine-tuning deepseek-coder-v2-lite? · #276 · fengyang95 · opened 4 months ago · 0 comments
Can llava fine-tuning be supported? · #275 · vincent507cpu · opened 4 months ago · 0 comments
expected string or bytes-like object · #274 · sankexin · opened 5 months ago · 0 comments
qwen2-7b lora loss is 0 · #273 · fengyang95 · closed 5 months ago · 1 comment
Qwen2-7B-Instruct training loss is 0, or inference fails with: probability tensor contains either `inf`, `nan` or element < 0 · #272 · WinterStraw · closed 5 months ago · 6 comments
Are Codestral fine-tuning and deployment supported? · #271 · SunRise-Star · opened 5 months ago · 0 comments
ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. · #270 · WeixuanXiong · opened 5 months ago · 1 comment
Qwen2-7b lora · #269 · linzm1007 · opened 5 months ago · 0 comments
add Qwen2 · #268 · yangjianxin1 · closed 5 months ago · 0 comments
Why is there no bos_token in the llama2 template? · #267 · qy1026 · opened 5 months ago · 0 comments
Why does using Unsloth cause OOM instead? · #266 · yuyu990116 · closed 5 months ago · 2 comments
NameError: name 'FastLanguageModel' is not defined · #265 · Jackiexiong · opened 5 months ago · 0 comments
"max_seq_length: the maximum length during training. Set it according to your hardware; longer lengths require more GPU memory." Is there a rule for this conversion? · #264 · sunzx8 · opened 6 months ago · 0 comments
unsloth does not support multi-machine multi-GPU training, and the reported gains are measured against a baseline without flashattn2 enabled; some questions about this · #263 · MonolithFoundation · opened 6 months ago · 0 comments
Does DPO support multi-GPU QLoRA training? · #262 · q497629642 · opened 6 months ago · 0 comments
Running the chat.py script fails when loading either the checkpoint produced by merge_lora or the adapter weights · #261 · 2013303386 · opened 6 months ago · 0 comments
add yi-1.5 · #260 · yangjianxin1 · closed 6 months ago · 0 comments
RuntimeError: FlashAttention only support fp16 and bf16 data type · #259 · sankexin · opened 6 months ago · 1 comment
NotImplementedError: No operator found for `memory_efficient_attention_forward` · #258 · sankexin · opened 6 months ago · 0 comments
KTO training implementation · #257 · vincezengqiang · opened 6 months ago · 0 comments
Error when training llama3-8b-it · #256 · wx971025 · opened 6 months ago · 2 comments
deepspeed installation error · #255 · TC10127 · opened 6 months ago · 1 comment
Multi-turn dialogue training quality · #254 · chanel111 · opened 6 months ago · 1 comment