axolotl-ai-cloud / axolotl

Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0
7.61k stars 829 forks source link

Add example for deepspeed config with cpu offloading. #1216

Open PhilipMay opened 8 months ago

PhilipMay commented 8 months ago

⚠️ Please check that this feature request hasn't been suggested before.

πŸ”– Feature description

I would like to use cpu offloading. There is no example config for deepspeed.

βœ”οΈ Solution

Provide a config.

Maybe this - but I am not 100% sure:

"zero_optimization": {
    "offload_optimizer": {
        "device": "cpu",
        "pin_memory": true
    },
    "offload_param": {
        "device": "cpu",
        "pin_memory": true
    },
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size":  1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  ...

❓ Alternatives

No response

πŸ“ Additional Context

No response

Acknowledgements

noobmaster29 commented 8 months ago

The example deepspeed configs are in the deepspeed_configs folder.

PhilipMay commented 8 months ago

The example deepspeed configs are in the deepspeed_configs folder.

@noobmaster29 But they do not offer an CPU offloading config.

NanoCode012 commented 7 months ago

zero3 used to have offload, but was removed. If you want to make a PR, you could re-add the old revision for zero3 but with a slightly different name like zero3_cpuoffload

NanoCode012 commented 5 months ago

Added a PR #1466 for this.