-
### Context
This task concerns enabling tests for **baichuan2-7b-chat**. More details are available in the openvino_notebooks [LLM chatbot README.md](https://github.com/openvinotoolkit/openvino_notebook…
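For reference, a minimal smoke test could look like the sketch below. This assumes the plain `transformers` load path (the notebook's OpenVINO conversion step is omitted), and the prompt is illustrative:

```python
# Minimal smoke test for baichuan2-7b-chat (sketch: plain transformers path;
# the OpenVINO conversion done in the notebook is omitted here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, trust_remote_code=True
)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```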
-
Why did someone earlier in the thread say they hit an out-of-VRAM error?
-
**Describe the bug**
When fine-tuning my model with deepspeed==0.13.5 and the Hugging Face Trainer, the loss and grad_norm become NaN at step 2.
![image](https://github.com/microsoft/DeepSpeed/assets/29994…
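For context, a minimal sketch of this kind of setup, launched with `deepspeed train.py` (the model id, dummy dataset, and DeepSpeed JSON path below are placeholders, not from the original run). When NaNs appear this early, the fp16 loss-scale is a common first suspect, so bf16 is shown as a commented alternative:

```python
# Sketch of the reported setup: Hugging Face Trainer driven by a DeepSpeed
# config. Model id, dataset, and config path are placeholders.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model = AutoModelForCausalLM.from_pretrained("my-base-model")   # placeholder
tokenizer = AutoTokenizer.from_pretrained("my-base-model")      # placeholder

# Tiny dummy dataset so the script is self-contained.
encoded = tokenizer(["hello world"] * 8)
train_dataset = Dataset.from_dict(dict(encoded)).map(
    lambda ex: {"labels": ex["input_ids"]}
)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    logging_steps=1,            # log loss/grad_norm every step
    max_grad_norm=1.0,
    fp16=True,                  # early-step NaNs often trace back to fp16
    # bf16=True,                # overflow; trying bf16 is a common first check
    deepspeed="ds_config_zero2.json",  # assumed ZeRO-2 JSON config
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```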
-
When running the code from the README:
`tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-13B-Chat", use_fast=False, trust_remote_code=True)`
I got this error:
-----------------------------…
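The README line alone isn't self-contained; here is a runnable version of it that also logs the environment detail that usually matters for remote-code tokenizers (the version print is an addition for debugging, not from the issue):

```python
# Self-contained version of the README snippet. Baichuan2's tokenizer is
# custom remote code, so the installed transformers version is worth logging
# when reporting errors like the one above.
import transformers
from transformers import AutoTokenizer

print("transformers", transformers.__version__)

tokenizer = AutoTokenizer.from_pretrained(
    "baichuan-inc/Baichuan2-13B-Chat",
    use_fast=False,
    trust_remote_code=True,  # the tokenizer class ships with the model repo
)
print(tokenizer("hello"))
```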
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/baichuan-inc/baichuan-7B/issues) and [Discussions](https://github.com/bai…
-
Using pad_token, but it is not set yet.
Loading base model for ppo training...
Loading base
Loading lora
Loading ppo
WARNING:root:A model is loaded from '/root/autodl-tmp/LLM/weights/sft_lora', and no v_head weig…
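The warning text matches trl's value-head wrapper, so the load path presumably looks something like the sketch below (assumes trl + peft; the base-model path is a placeholder). Per trl's own message, the missing v_head is expected when starting, rather than resuming, PPO training:

```python
# Sketch of the load path the log suggests: a LoRA SFT checkpoint wrapped
# with a value head for PPO (assumes trl + peft are installed).
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

base_path = "path/to/base-model"  # placeholder: the SFT base checkpoint
lora_path = "/root/autodl-tmp/LLM/weights/sft_lora"

tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
if tokenizer.pad_token is None:
    # addresses "Using pad_token, but it is not set yet."
    tokenizer.pad_token = tokenizer.eos_token

# trl detects the peft adapter directory, loads the base model behind it,
# attaches the adapter, and initializes a fresh value head -- hence the
# "no v_head weight is found" warning on a first PPO run.
ppo_model = AutoModelForCausalLMWithValueHead.from_pretrained(lora_path)
```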
-
## Issue 1 on XPU with Python 3.10 [Fixed after releasing bigdl-core-xe and bigdl-core-xe-esimd for Python 3.10]
On Arc14, I followed https://github.com/intel-analytics/BigDL/blob/main/python/llm/exa…
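For reference, a minimal sketch of the bigdl-llm XPU flow that example follows, assuming the Intel GPU wheels are installed (`pip install bigdl-llm[xpu]`); the model id is illustrative:

```python
# Sketch of the bigdl-llm XPU flow: load with 4-bit weights, then move the
# model to the Intel GPU (assumes bigdl-llm[xpu] and its IPEX dependency).
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers 'xpu')
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Chat"  # illustrative

model = AutoModelForCausalLM.from_pretrained(
    model_id, load_in_4bit=True, trust_remote_code=True
).to("xpu")
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```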
-
As the title says.
-
Hello, author! I'm very interested in your work. I took the weights you released and wanted to test your model, but I found the results not very satisfactory. The questions I asked are several that you mention in the documentation; below is my test log.
The llama-7b model is itself very prone to lapsing into nonsense. I'm currently doing work similar to yours, using LoRA fine-tuning on alpaca-7b, and I find the results far better than llama's. The work of expanding the Chinese vocabulary has also already been done by others, and the results after LoRA training improve substantially. I wonder…
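For readers unfamiliar with the recipe being referenced, here is a minimal LoRA setup in the alpaca-lora style using peft. The hyperparameters are the commonly used defaults, not the commenter's exact configuration, and the model id is illustrative:

```python
# Minimal LoRA fine-tuning setup in the alpaca-lora style (values are the
# widely used defaults, not the commenter's exact configuration).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # illustrative id

lora_config = LoraConfig(
    r=8,                                   # low-rank dimension
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # LLaMA attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # LoRA trains well under 1% of the weights
```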