UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Adding deepspeed config #2833

Open imrankh46 opened 3 months ago

imrankh46 commented 3 months ago

@tomaarsen Hello Tom, I hope you're doing well.

I am trying to enable DeepSpeed in the Sentence Transformers training arguments via deepspeed="deepspeed_config.json", and I also tried with an accelerate config, but it's not working.

Here is an example of my config:

     "bf16": {
         "enabled": true
     },
     "zero_optimization": {
         "stage": 3,
         "stage3_gather_16bit_weights_on_model_save": false,
         "offload_optimizer": {
             "device": "none"
         },
         "offload_param": {
             "device": "none"
         }
     },
     "gradient_clipping": 1.0,
     "train_batch_size": "auto",
     "train_micro_batch_size_per_gpu": "auto",
     "gradient_accumulation_steps": 10,
     "steps_per_print": 2000000
 }

Here are the training arguments:

    # Now proceed as normal, plus pass the deepspeed config file
    training_args = TrainingArguments(..., deepspeed="ds_config_zero3.json")
    trainer = Trainer(...)
    trainer.train()

You can check the source: https://huggingface.co/docs/transformers/v4.26.1/en/main_classes/deepspeed

tomaarsen commented 2 months ago

Hello!

Apologies for the delay; I've been recovering from surgery this last month. I'm not very familiar with DeepSpeed, as I've only used it a couple of times myself. Could you share some information about what goes wrong? Also, your last snippet shows that you're using Trainer rather than SentenceTransformerTrainer; perhaps that's the issue?
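
If it helps, here's a minimal sketch of what I'd try first (I haven't verified this against DeepSpeed myself; the model, dataset, and loss below are placeholders, so swap in your own). Since SentenceTransformerTrainingArguments subclasses transformers.TrainingArguments, the deepspeed argument should be passed through to the same integration:

    from datasets import load_dataset
    from sentence_transformers import (
        SentenceTransformer,
        SentenceTransformerTrainer,
        SentenceTransformerTrainingArguments,
    )
    from sentence_transformers.losses import MultipleNegativesRankingLoss

    # Placeholder model and dataset; replace with your own.
    model = SentenceTransformer("microsoft/mpnet-base")
    train_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="train")
    loss = MultipleNegativesRankingLoss(model)

    args = SentenceTransformerTrainingArguments(
        output_dir="output",
        per_device_train_batch_size=16,
        # The HF DeepSpeed integration raises an error if non-"auto" values in the
        # config conflict with the training arguments, so these must match the config:
        bf16=True,                      # matches "bf16": {"enabled": true}
        gradient_accumulation_steps=10, # matches "gradient_accumulation_steps": 10
        deepspeed="deepspeed_config.json",  # path to the ZeRO-3 config above
    )

    trainer = SentenceTransformerTrainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        loss=loss,
    )
    trainer.train()

If that still fails, sharing the full traceback would help narrow down whether the problem is in the DeepSpeed integration or in the Sentence Transformers trainer.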