-
I am using DeepSpeed 0.15.1 on two A6000 GPUs, following the [Hugging Face Non-Trainer DeepSpeed integration](https://huggingface.co/docs/transformers/main/en/deepspeed?models=pretrained+model#non-trainer-dee…
-
### System Info
Hardware: Amazon Linux EC2 Instance.
8 NVIDIA A10G (23 GB)
```
Python 3.10.14
CUDA Version: 12.4
accelerate==0.34.2
bitsandbytes==0.44.1
nvidia-cublas-cu12==12.1.3.1
nvidi…
-
The following program crashes when I invoke `std::string t = config.Test("test");` in the callback `cb`. The reason seems to be that `config` is not able to access the memory (?…
-
I am working on a use case that loads a model across parallel GPUs, unloads it, and then loads a new model in the same process.
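For reference, a minimal sketch of that load, unload, reload cycle, not the code from this report. `ModelSlot` and its `factory` argument are hypothetical names, and the sketch assumes the slot holds the only strong reference to the model. `torch` is treated as optional, so the CUDA cache step is skipped when it is not installed.

```python
import gc

try:
    import torch  # optional: without it the sketch only runs the garbage collector
except ImportError:
    torch = None


class ModelSlot:
    """Hypothetical holder that owns the only strong reference to a model."""

    def __init__(self):
        self.model = None

    def load(self, factory):
        # Release the previous model before building the next one.
        self.unload()
        self.model = factory()
        return self.model

    def unload(self):
        # Drop the reference, collect cycles, then return cached CUDA blocks.
        self.model = None
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
```

Note that `torch.cuda.empty_cache()` only returns cached blocks that no live tensor still uses, so any other reference to the model (an optimizer, a pipeline object, a stored traceback) keeps its memory allocated.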
```
@classmethod
async def unload_models(cls, exiting=…
-
Why:
- no need to recompile every time!
- define once, reference everywhere
- unlocks experiments that vary parameters
Example: move "calibrations" to a JSON file, like resolution in th…
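As an illustration of the idea only, here is a small Python sketch of reading such parameters from a JSON file at startup. The keys and default values (`gain`, `offset`, the resolution numbers) and the `load_settings` helper are assumptions made for the example; the project's real parameter names and language are not shown in this request.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real parameter names are not given in the request.
DEFAULTS = {
    "resolution": [640, 480],
    "calibrations": {"gain": 1.0, "offset": 0.0},
}


def load_settings(path):
    """Return DEFAULTS overridden by the top-level keys found in a JSON file.

    The merge is shallow: a "calibrations" entry in the file replaces the
    whole default "calibrations" dictionary. A missing file yields a copy
    of the defaults.
    """
    settings = dict(DEFAULTS)
    file = Path(path)
    if file.exists():
        settings.update(json.loads(file.read_text(encoding="utf-8")))
    return settings
```

With this shape, trying a different calibration means editing the JSON file and restarting, with no rebuild.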
-
There were no errors on 5.15.0-210.163.7.el8uek.x86_64
[root@localhost ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.10 (Ootpa)
[root@localhost ~]# uname -a
Linux localhost.l…
-
Running the code from the Model Card, I get this error:
`ValueError: 'dac' is already used by a Transformers config, pick another name.`
``` File "C:\Users\XXXX\AppData\Roaming\Python\Python311\…
-
The name says it all: auto start is always active, even when set to false. To clarify, the ship always travels to the planet upon exploring, even with AutoStart set to false in the configs.
…
-
Hello! I keep running into OOM when running extrapolation with the 72B model on 64 GPUs. Is something misconfigured in multi_node.yaml?
multi_node.yaml
`debug: false
deepspeed_config:
deepspeed_config_file: utils/accelerate_configs/zero3_offload.json
deepspeed_multinode_lau…
-
The following error occurs with ms-swift 2.6.0.dev0 and transformers 4.45.2:
[rank1]: Traceback (most recent call last):
[rank1]: File "/home/xxx/anaconda3/envs/f_got/lib/python3.10/site-packages/transformers/models/auto/conf…