Open carmocca opened 1 year ago
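Outline & Motivation
DeepSpeed works by using a configuration file (dictionary) that allows customizing all of its aspects: https://www.deepspeed.ai/docs/config-json/
The `DeepSpeedStrategy` supports two ways of defining this. Option 2 is a set of arguments in `__init__` that are used to define a base config: https://github.com/Lightning-AI/lightning/blob/b792c90ea7148d61af192fde6c338ebbd355702f/src/lightning/fabric/strategies/deepspeed.py#L242-L271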
Option 2 is not scalable because:
Pitch
Remove all these exposed arguments and just have a `config` argument that overloads support for:
import deepspeed as ds
from lightning.fabric.strategies import DeepSpeedStrategy

DeepSpeedStrategy(config="my/config/path.json")

config = ds.runtime.config.DeepSpeedConfig({"train_micro_batch_size_per_gpu": 2})
DeepSpeedStrategy(config=config)

config = {"zero_optimization": {"offload_optimizer": {"device": "cpu"}}}
DeepSpeedStrategy(config=config)
Where the default config is created by calling: https://github.com/microsoft/DeepSpeed/blob/085981bf1caf5d7d0b26d05f7c7e9487e1b35190/deepspeed/runtime/config.py#L674
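As a rough illustration of what "overloading" a single `config` argument could involve, here is a minimal sketch that normalizes two of the proposed input forms (a path to a JSON file, or a dictionary) into a plain dict. The helper name `load_config` and its error messages are hypothetical and not part of Lightning or DeepSpeed. The third form, a `DeepSpeedConfig` object, is left out because handling it depends on DeepSpeed internals.

```python
import copy
import json
import os
from pathlib import Path
from typing import Any, Dict, Mapping, Union

# Hypothetical alias for the input forms this sketch accepts.
ConfigInput = Union[str, "os.PathLike[str]", Mapping[str, Any]]


def load_config(config: ConfigInput) -> Dict[str, Any]:
    """Hypothetical helper: turn a JSON file path or a mapping into a plain dict."""
    if isinstance(config, (str, os.PathLike)):
        path = Path(config)
        if not path.is_file():
            raise FileNotFoundError(f"DeepSpeed config file not found: {path}")
        with path.open() as f:
            return json.load(f)
    if isinstance(config, Mapping):
        # Deep copy so later edits to the result do not leak into the caller's dict.
        return copy.deepcopy(dict(config))
    raise TypeError(f"Unsupported config type: {type(config).__name__}")


if __name__ == "__main__":
    user_config = {"zero_optimization": {"offload_optimizer": {"device": "cpu"}}}
    resolved = load_config(user_config)
    print(resolved["zero_optimization"]["offload_optimizer"]["device"])
```

Dispatching on the input type in one place keeps the strategy's signature small: any new DeepSpeed option is reachable through the config itself rather than through a new `__init__` argument.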
Additional context
DeepSpeed is considered experimental, so we could make this breaking change: https://github.com/Lightning-AI/lightning/blob/b792c90ea7148d61af192fde6c338ebbd355702f/src/lightning/fabric/strategies/deepspeed.py#L99
cc @justusschock @awaelchli @carmocca
+1 for this!