shibing624/MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0 · 3.24k stars · 492 forks
Issues
RuntimeError: CUDA error: device-side assert triggered. indexSelectLargeIndex: block: [58,0,0], thread: [24,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
#275 · yyz-selfiie · closed 10 months ago · 4 comments
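This device-side assert typically means an input token id is out of range for the model's embedding table (for example, after adding special tokens without resizing the embeddings). A minimal pure-Python sketch of the kind of sanity check that surfaces the offending id with a readable error instead of a CUDA assert (the helper name `check_token_ids` is hypothetical, not from this repo):

```python
def check_token_ids(token_ids, vocab_size):
    """Raise a readable error if any id would overflow the embedding table."""
    bad = [t for t in token_ids if not 0 <= t < vocab_size]
    if bad:
        raise ValueError(
            f"token ids {bad} are out of range for vocab_size={vocab_size}; "
            "did you add special tokens without resizing the embeddings?"
        )

# id 64000 would overflow a 64000-token vocabulary (valid range is 0..63999)
check_token_ids([1, 2, 3], vocab_size=64000)  # passes silently
```

Running such a check on CPU before the forward pass is useful because the CUDA assert itself reports only the kernel coordinates, not which token id was invalid.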
Newbie question: when running pretraining.py, where do I modify the code to manually choose which GPUs it runs on?
#274 · JimberZ · closed 10 months ago · 0 comments
After continued pretraining, merging the LoRA model into the base model (baichuan2) makes inference very slow: each result takes several minutes, whereas the original baichuan2 takes only a few seconds
#273 · sweetboxwwy · closed 10 months ago · 1 comment
About the error ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.") — ValueError: weight is on the meta device, we need a `value` to put in on cpu.
#272 · waycup7 · closed 8 months ago · 1 comment
RuntimeError: Error(s) in loading state_dict for PeftModelForSequenceClassification
#271 · waycup7 · closed 10 months ago · 1 comment
Is model evaluation necessary?
#270 · chenkang404 · closed 1 month ago · 2 comments
When enabling shift_attn during SFT, by how much can the window length be increased?
#269 · sunshineyg2018 · closed 10 months ago · 0 comments
[Dataset] What is the difference between the Alpaca and Vicuna templates?
#268 · SoYuCry · closed 10 months ago · 3 comments
Error when running inference with the SFT model, please help take a look!
#267 · SoYuCry · closed 10 months ago · 1 comment
How much GPU memory does chatglm2 pretraining need? Even 4×44G runs OOM
#266 · xpcc355 · closed 10 months ago · 1 comment
[2023-11-20 19:18:31,206] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2061498) of binary
#265 · dshwei · closed 10 months ago · 4 comments
After multi-turn dialogue SFT, the model repeats sentences in its replies during testing
#264 · chloefresh · closed 1 month ago · 2 comments
Training a reward model with baichuan2-7B, the loss quickly drops to 0
#263 · a101269 · closed 10 months ago · 4 comments
baichuan2's chat template
#262 · w5688414 · closed 7 months ago · 1 comment
About continued pretraining
#261 · tszslovewanpu · closed 1 month ago · 2 comments
Could someone take a look at my loss? Train loss keeps decreasing, but eval loss bottoms out and then rebounds sharply
#260 · SoYuCry · closed 10 months ago · 1 comment
Error when merging the reward model
#259 · dogrepairditch · closed 10 months ago · 2 comments
convert_dataset.py always processes only a single file!
#258 · tuqingwen · closed 1 month ago · 1 comment
About the inference input
#257 · tszslovewanpu · closed 11 months ago · 5 comments
DPO training suddenly aborts midway through the run
#256 · tuqingwen · closed 1 month ago · 1 comment
About the format of the merged model
#255 · tszslovewanpu · closed 1 month ago · 2 comments
After saving weights during full-parameter SFT, training does not continue
#254 · nuoma · closed 11 months ago · 2 comments
Multiple checkpoints are saved during SFT; can these checkpoints be merged?
#253 · rainfallLLF · closed 11 months ago · 3 comments
During SFT, the loss quickly drops to 0
#252 · SoYuCry · closed 11 months ago · 10 comments
After RL training, model merging reports an error; all previous steps followed the pipeline commands exactly
#251 · PICOPON · closed 10 months ago · 1 comment
Help: SFT training hangs on 2 machines with 8 GPUs
#250 · mymong · closed 10 months ago · 3 comments
After PT and SFT, inference with the SFT model fails: RuntimeError: probability tensor contains either inf, nan or element < 0
#249 · dage0127 · closed 11 months ago · 2 comments
Are there plans to support mistral 7b?
#248 · xbeark · opened 11 months ago · 3 comments
Data mixing ratio for the baichuan LoRA model
#247 · xxcoco763 · closed 10 months ago · 4 comments
In version 1.6.0, specifying the flash_attn argument raises "Some specified arguments are not used by the HfArgumentParser: {remaining_args}"
#246 · CHAOJICHENG5 · closed 11 months ago · 3 comments
Is chatglm3-6b supported? (label: enhancement)
#245 · hongyix · closed 11 months ago · 2 comments
Dataset splitting: do train and val need to be split manually?
#244 · SoYuCry · closed 11 months ago · 4 comments
When setting --use_peft False for full-parameter training (--fp16 must be removed), does --torch_dtype bfloat16 also need to be removed?
#243 · SoYuCry · closed 11 months ago · 6 comments
dataset bug help
#242 · YJSoooooo · closed 11 months ago · 7 comments
dataset bug help
#241 · YJSoooooo · closed 11 months ago · 0 comments
Help: training baichuan-13B-chat on 2 machines with 2 GPUs each (24G A5000) fails; the code is supervised_finetuning.py
#240 · sweetboxwwy · closed 7 months ago · 2 comments
Help! During chinese-llama2 7B DPO training, loss=0 and eval=nan
#239 · Everyday-seu · closed 10 months ago · 6 comments
inference.py and gradio_demo.py give inconsistent results
#238 · ozmemory · closed 7 months ago · 3 comments
Training baichuan2-13B-chat on two nodes (each with two A5000 GPUs) fails
#237 · sweetboxwwy · closed 11 months ago · 0 comments
How much GPU memory does baichuan13b-chat continued pretraining (LoRA) need?
#236 · sweetboxwwy · closed 11 months ago · 10 comments
Training baichuan13B-chat on a single machine with 2 A5000 GPUs (24G each): halfway through training, watch -n 1 nvidia-smi shows memory usage clearly overflowing, then torch.distributed.elastic.multiprocessing.errors.ChildFailedError is raised
#235 · sweetboxwwy · closed 11 months ago · 1 comment
Training loss drops sharply after each epoch
#234 · jiangtann · closed 11 months ago · 1 comment
DPO training with chinese-llama-alpaca 7b fails, please help!
#233 · Everyday-seu · closed 7 months ago · 2 comments
SFT with Baichuan2-7B-Chat-4bits fails with "bitsandbytes>=0.37.0"
#232 · dage0127 · closed 11 months ago · 4 comments
Question about the pretraining approach
#231 · J-G-Y · closed 10 months ago · 1 comment
Question about training a Chat model
#230 · ozmemory · closed 11 months ago · 1 comment
Running baichuan2 fails: AttributeError: 'NormHead' object has no attribute 'in_features'
#229 · daiyizheng · closed 11 months ago · 1 comment
Is chatglm2-6b-32k supported?
#228 · LanHao0 · closed 7 months ago · 2 comments
About shuffling of the training data
#227 · nuoma · closed 11 months ago · 2 comments
DPO
#226 · C929-x · closed 11 months ago · 5 comments