[Question] Start LoRA fine-tuning of baichuan-7B with 2 lines of code

beyondguo commented 1 year ago

Required prerequisites

[X] I have read the documentation https://github.com/baichuan-inc/baichuan-7B/blob/HEAD/README.md.
[X] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
[ ] Consider asking first in a Discussion.

Questions

Start LoRA fine-tuning of baichuan-7B with 2 lines of code

Repository: https://github.com/beyondguo/LLM-Tuning
Usage: run sh tokenize.sh, then sh train.sh

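Results:

Below, the baichuan-7B model is fine-tuned to extract <company, sentiment, reason> triples from news in JSON format (the original baichuan-7B model cannot do this extraction):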
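For illustration only, and not the author's actual data or code: a minimal sketch of what one training sample for such a triple-extraction task could look like, plus a defensive parser for the model's JSON output. The field names (context, target, company, sentiment, reason) and the prompt wording are assumptions, not taken from the repository.

```python
import json

# Hypothetical key names for one extracted triple.
TRIPLE_KEYS = {"company", "sentiment", "reason"}


def build_sample(news: str, triples: list) -> dict:
    """Pair a news text with its JSON-formatted target string."""
    # Hypothetical prompt template; the real project may phrase this differently.
    context = (
        "Extract <company, sentiment, reason> triples as JSON.\n"
        f"News: {news}\nAnswer: "
    )
    # ensure_ascii=False keeps non-ASCII (e.g. Chinese) text readable.
    target = json.dumps(triples, ensure_ascii=False)
    return {"context": context, "target": target}


def parse_output(text: str) -> list:
    """Return the complete triples found in model output, or [] if unusable."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [d for d in data if isinstance(d, dict) and TRIPLE_KEYS.issubset(d)]
```

Parsing defensively matters here because a generative model may emit malformed JSON; treating that as "no triples" keeps an evaluation loop from crashing.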

Checklist

[X] I have provided all relevant and necessary information above.
[X] I have chosen a suitable title for this issue.

yzxyzh commented 1 year ago

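Hello, in this example, is the loss used for fine-tuning just the ordinary SFT loss? For instance, the input is a news article and the output is a JSON-format string such as {ORG:XXX}. I looked at the repo you provided; it seems to use a loss on numbers, and its dataset and task also differ from the example shown here, so I wanted to ask.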

beyondguo commented 1 year ago

What I show here is the data format for one specific task; you can also train with a general instruction tuning corpus.
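As background on the "ordinary SFT loss" asked about above, here is a minimal sketch of a common convention, assuming a causal LM trained with PyTorch cross-entropy: the loss is computed only on the response tokens by setting prompt positions in the labels to -100, which is the default ignore_index of CrossEntropyLoss. This is a generic pattern, not necessarily what the LLM-Tuning repository does; the function name and arguments are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_sft_example(prompt_ids, response_ids, eos_token_id, max_len=None):
    """Concatenate prompt and response; mask the prompt out of the loss."""
    input_ids = list(prompt_ids) + list(response_ids) + [eos_token_id]
    # Prompt positions are ignored; response and EOS positions are supervised.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids) + [eos_token_id]
    if max_len is not None:
        # Simple right truncation; may cut off the EOS token for long samples.
        input_ids, labels = input_ids[:max_len], labels[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```

With this layout the same code works whether the response is free text or a JSON string; only the tokenized content differs.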

wxl18039675170 commented 1 year ago

@beyondguo Hello, on a specific task, which performs better after instruction tuning: ChatGLM or baichuan?

beyondguo commented 1 year ago

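If your task is heavily instruction-driven, such as coding, reformatting text, or fine-grained information extraction, fine-tuning ChatGLM directly is probably the better choice for now. That does not mean ChatGLM is better than baichuan; rather, ChatGLM has already gone through instruction training with the full RLHF process. Once an official RLHF version of baichuan is released, the two can be compared again; I expect it will perform well too.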

RileyShe commented 1 year ago

> What I show here is the data format for one specific task; you can also train with a general instruction tuning corpus.

https://github.com/beyondguo/LLM-Tuning Does it support chatglm2-6b?

angel1288 commented 1 year ago

Hello, I ran into the following error during fine-tuning:

Traceback (most recent call last):
  File "/data3/LLM-Tuning/chatglm_lora_tuning.py", line 142, in <module>
    main()
  File "/data3/LLM-Tuning/chatglm_lora_tuning.py", line 128, in main
    trainer = ModifiedTrainer(
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/transformers/trainer.py", line 499, in __init__
    self._move_model_to_device(model, args.device)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/transformers/trainer.py", line 741, in _move_model_to_device
    model = model.to(device)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!

Have you run into this before?

beyondguo commented 1 year ago

> https://github.com/beyondguo/LLM-Tuning Does it support chatglm2-6b?

I will try running it in the next few days.

beyondguo commented 1 year ago

@angel1288 This looks like a package version problem. Could you try setting up your environment with the versions listed in my repository?

angel1288 commented 1 year ago

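@beyondguo Hello, my environment now matches the test environment given in the git repo, but it still fails. Could the versions of some other packages be the cause?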

angel1288 commented 1 year ago

@beyondguo After adding empty_init=False to AutoModel.from_pretrained for chatglm, the error above no longer appears. Now I get this error:

  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/transformers/trainer.py", line 1664, in train
    return inner_training_loop(
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/transformers/trainer.py", line 2735, in training_step
    loss = self.compute_loss(model, inputs)
  File "/data3/push_recall/LLM-Tuning/chatglm_lora_tuning.py", line 57, in compute_loss
    return model(
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/peft/peft_model.py", line 678, in forward
    return self.base_model(
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1190, in forward
    transformer_outputs = self.transformer(
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/env/miniconda3/envs/baichuan/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 936, in forward
    attention_mask = self.get_masks(
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 682, in get_masks
    context_lengths = [seq.tolist().index(self.config.bos_token_id) for seq in input_ids]
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 682, in <listcomp>
    context_lengths = [seq.tolist().index(self.config.bos_token_id) for seq in input_ids]
ValueError: 130004 is not in list
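The final frame of this traceback calls list.index, which raises ValueError when the value is absent. In other words, at least one sequence in the batch does not contain the id that the model config treats as bos_token_id. A minimal sketch for checking tokenized data before training, using plain lists instead of tensors; the helper names and the id 99999 in the usage note are made up for illustration:

```python
def find_context_length(input_ids, bos_token_id):
    """Mirror the failing expression for a single sequence."""
    # list.index raises ValueError if bos_token_id is not present.
    return list(input_ids).index(bos_token_id)


def samples_missing_bos(batch, bos_token_id):
    """Return the indices of sequences that do not contain bos_token_id."""
    return [i for i, seq in enumerate(batch) if bos_token_id not in seq]
```

For example, samples_missing_bos([[11, 12, 130004, 13], [11, 12, 99999, 13]], 130004) flags the second sequence. A non-empty result suggests the data was tokenized with special-token ids that do not match the model config being loaded.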

beyondguo commented 1 year ago

@angel1288 Are you using the latest version of ChatGLM-6B? They updated it a while ago.

angel1288 commented 1 year ago

@beyondguo Does this fine-tuning code not support the new model?

angel1288 commented 1 year ago

@beyondguo With the baichuan configuration, multi-GPU training runs, but single-GPU training fails with: NotImplementedError: Cannot copy out of meta tensor; no data!

beyondguo commented 1 year ago

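> @beyondguo Does this fine-tuning code not support the new model?

@angel1288 I meant that you may be using the old version, which causes the error; my code targets the new version of ChatGLM-6B. Each model's config may differ, so the code has to be written for each one specifically.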

beyondguo commented 1 year ago

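@RileyShe ChatGLM2-6B is now supported. Please move related discussion to the project's own discussion area: https://github.com/beyondguo/LLM-Tuning

This thread is mainly for discussing the baichuan model.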

angel1288 commented 1 year ago

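> @angel1288 I meant that you may be using the old version, which causes the error; my code targets the new version of ChatGLM-6B. Each model's config may differ, so the code has to be written for each one specifically.

@beyondguo OK, I checked the model files: vocab_size 150528 is what the cpu_quan branch has, while the main branch has been updated to 130528.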
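A quick way to spot this kind of mismatch between two model revisions is to diff their config dictionaries. The sketch below is a generic helper, not part of the repository; only the two vocab_size values come from this thread, and the other key in the usage note is illustrative.

```python
def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ.

    A key missing on one side is reported with None for that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

For example, diff_configs({"vocab_size": 150528, "num_layers": 28}, {"vocab_size": 130528, "num_layers": 28}) reports only the vocab_size entry. In practice the two dicts could be loaded with json.load from each revision's config.json, assuming both revisions ship one.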

beyondguo commented 1 year ago

https://github.com/beyondguo/LLM-Tuning/tree/master/projects/ChatBaichuan-HC3

This shows how to use the HC3 dataset to give baichuan-7B conversational ability.
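As a rough sketch of the data preparation such a project needs, the snippet below turns one HC3-style record into prompt/answer training samples. It assumes records carry a question string and a chatgpt_answers list, as in the public HC3 release; the output keys q and a and the "Human:/Assistant:" template are hypothetical and may differ from what the linked project uses.

```python
import json


def hc3_to_samples(record: dict) -> list:
    """Build one training sample per non-empty ChatGPT answer."""
    question = record["question"].strip()
    prompt = f"Human: {question}\nAssistant: "  # hypothetical chat template
    return [
        {"q": prompt, "a": answer.strip()}
        for answer in record.get("chatgpt_answers", [])
        if answer.strip()
    ]


def to_jsonl(samples) -> str:
    """Serialize samples as JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(s, ensure_ascii=False) for s in samples)
```

Training on pairs like these is what teaches a base model to answer the question rather than continue the text, which is the difference visible in any before/after comparison of a base model and its chat-tuned version.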

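Results:

>>> chat("Ugh, I haven't been able to sleep at night lately!")

Original baichuan-7B:

Are you overthinking things?
1. I have a friend. He had liked a girl for a long time but never had the courage to tell her. Once, when they were out together, the guy accidentally sprained his foot. He then said to the girl: "Look how swollen my foot is, how am I supposed to come and see you?" Unexpectedly, the girl replied: "It's fine, no problem! 2. Today on the street I saw a couple. The woman was quite pretty; the man had a big belly that stuck way out, and he swayed from side to side as he walked, which looked very mismatched. So I walked over, patted him on the shoulder and said: Buddy, great figure, you're a lucky man~ To my surprise the guy turned around, looked at me and said: Brother, don't get the wrong idea, this is my wife... 3. Yesterday I had a fight with my girlfriend; she slammed the door and went to stay at a hotel. In the morning I found a text message on my phone: "Sorry, you spent 0 yuan last night." I got angry right away. Who checks up on someone this closely? I called her immediately to confront her. The call connected and a gentle voice said: "Sorry, we are the mobile carrier. While checking your bill just now, we saw that your boyfriend had booked a hotel room, so we sent a message to let you know. ......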

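ChatBaichuan-HC3:

I'm sorry to hear that you are feeling troubled. Sleep problems are very common and can have a big impact on your life. If you are struggling with insomnia, here are some suggestions that may help improve your sleep quality: Try relaxing before bed: spend some time before sleeping on meditation, breathing exercises, or other relaxing activities. Keep the bedroom quiet and comfortable: avoid noise and light as much as possible, and make sure the mattress is neither too soft nor too firm. Get enough sleep: aim for at least 7-8 hours of sleep every day. If you cannot reach this goal, consider adjusting your daily routine to increase your sleep time. Look for the cause: sometimes sleep problems are caused by specific factors, such as too much stress, physical discomfort, or irregular eating habits. So it is worth finding the root cause of the sleep problem and then taking appropriate measures to address it. I hope these suggestions help.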

>>> chat("Who developed you?")

Original baichuan-7B:

I am you; I am your own self. (self-nature)

ChatBaichuan-HC3:

I am a computer program, written by a person or a group of people. My purpose is to help people solve problems and answer questions.

baichuan-inc / Baichuan-7B