-
I am trying to deploy a Baichuan2-7B model on a machine with two Tesla V100 GPUs. Unfortunately, each V100 has only 16 GB of memory.
I have applied INT8 weight-only quantization, so the size of the engine I…
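A rough back-of-envelope check of whether the INT8 weights fit, as a sketch under stated assumptions (7B parameters at 1 byte per weight; activations, KV cache, and runtime buffers are not counted and will add more):

```python
# Hypothetical estimate, not read from the actual engine:
params = 7e9                  # assumed parameter count for a 7B model
bytes_per_weight = 1          # INT8 weight-only quantization
weights_gb = params * bytes_per_weight / 1024**3
print(f"{weights_gb:.1f} GB")  # roughly 6.5 GB for the weights alone
```

So the quantized weights alone would fit on a single 16 GB V100, but KV cache and activations at longer sequence lengths can still push total usage past the limit.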
-
Using the latest version of BELLE to run SFT on baichuan2-7b, the learning rate stays at lr=0 for the whole run; even after the training step count exceeds the warmup steps, it still prints:
tried to get lr value before scheduler/optimizer started stepping, returning lr=0
Launch script:
![image](https://github.com/baichuan-inc/Baichuan2…
-
FileNotFoundError: [Errno 2] No such file or directory: 'baichuan-inc/Baichuan2-13B-Chat-4bits/pytorch_model.bin'
What is the reason for this download error?
-
When running the code from the README,
`tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-13B-Chat", use_fast=False, trust_remote_code=True)`
I got this error:
-----------------------------…
-
As in the title; using the default data.
deepspeed --include=localhost:4,5,7 fine-tune.py \
--report_to "none" \
--data_path "data/belle_chat_ramdon_10k.json" \
--model_name_or_path "/home/admin/baichuan2/baichuan…
-
When fine-tuning the baichuan2-7B-Base model, I found that the input token length cannot exceed 512, even though the officially stated maximum length is 4096.
Fine-tuning follows the official tutorial using LoRA, with the official dataset; when over-long samples are added to the data, this appears:
Token indices sequence length is longer than the specified maximum sequence length for …
-
tokenizer: `moka-ai/m3e-base`
Error:
```shell
Traceback (most recent call last):
  File "webui_xbl_stable.py", line 449, in <module>
    model_status = init_model()
  File "webui_xbl_stable.py", line 166…
-
Hi team,
I want to release the associated GPU memory by deleting the model variable after model generate, but it does not work as I expected.
The demo code is below:
import torch
import time
import n…
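For what it's worth, `del model` only removes one name; the underlying object is freed only once every reference to it is gone. A minimal pure-Python sketch of this pitfall (`FakeModel` is a hypothetical stand-in for the real model object):

```python
import gc
import weakref

class FakeModel:
    """Hypothetical stand-in for a large model object."""
    pass

model = FakeModel()
probe = weakref.ref(model)   # lets us check whether the object is still alive

alias = model                # a second reference, e.g. kept by a pipeline or closure
del model                    # removes the name, but `alias` still holds the object
assert probe() is not None   # the object has NOT been freed yet

del alias                    # drop the last reference
gc.collect()                 # sweep up any reference cycles
assert probe() is None       # now the object is actually gone
```

With PyTorch specifically, even after the tensors themselves are freed, the CUDA caching allocator keeps the blocks reserved; calling `torch.cuda.empty_cache()` afterwards returns them to the driver so tools like `nvidia-smi` report the memory as released.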
-
The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the…
-
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._ta…