-
### System Info
OS version: Ubuntu 18.04.1
GPU: RTX 2080
NVIDIA & CUDA:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: …
-
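After full fine-tuning, GPU memory usage at inference rose from 14000 MB to 27000 MB. What causes this? Separately, my training samples have long inputs, over 3000 tokens. With the length set to 3000, the 1.1B bloomz model on a single 46 GB L40S can only be trained with batch size 1 plus ZeRO-2 offload; otherwise it runs out of memory (OOM). Is this normal?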
-
Hi all! I'm evaluating this library as an alternative to Transformer Engine for the [accelerate project](https://github.com/huggingface/accelerate) and have a few general questions:
* On…
-
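### Please ask your question
So far I have only seen that when a model is loaded with AutoModelForCausalLM, it is downloaded automatically and converted via the convert_from_torch parameter. Because my server cannot access the external network, I manually downloaded the models from Hugging Face to local disk and loaded them from there, but converting a local model fails with various strange errors. I have tried bloom, llama, and baichuan so far, and none of them converted successfully.
I see that similar frameworks (hf transfor…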
-
### System Info
accelerate==0.26.1, transformers==4.37.0, peft==0.7.1, torch==2.1.2, sagemaker notebook
### Who can help?
Hi, I was running the prompt tuning tutorial from Hugging Face shown in this l…
-
**Describe the bug**
I want to use deepspeed from a script, so I installed it with pip:
```
(base) forestbat@vm-jupyterhub-server:~/BELLE/train$ pip install deepspeed
Defaulting to user installatio…
-
![image](https://github.com/MartialBE/one-api/assets/53845535/cc79da65-67ee-457c-b9be-c024653edc18)
Selecting two groups and choosing to fill in all models triggers an SQL error.
Current version: dev-27738aa
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
Run the following command: accelerate launch src/train_bash.py --stage sft --model_name_or_path bigscience/bloomz-560m --…
-
### System Info
tgi `1.1.0`
linux + docker
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported command
- [ ] My own modifications
### Reproduction
using…
-
Since the DPO workflow doesn't support do_predict, I'm trying to export the model and then run do_predict with the sft workflow. But the predictions I'm getting are empty strings.
```
python src/export_mo…