jeffra opened this issue 4 years ago
@jeffra on a somewhat related note - a lot of the documentation links in the README file are now throwing 404 errors. Could you please take a look at these? Thanks!
Hi @g-karthik, sorry for the disruption. I broke some links while moving over to our new website: https://www.deepspeed.ai/
The links in the README should be fixed again. The bulk of the documentation is now on our webpage and hosted from docs/.
@ShadenSmith thanks for the fix! I still see one 404: https://www.deepspeed.ai/docs/config_json/
@g-karthik whoops, I thought I had fixed that one last night. New link is https://www.deepspeed.ai/docs/config-json/
Thanks for the reports! Feedback is always appreciated.
@jeffra Could we add this? Important: Make sure to include the following line in your code to add DeepSpeed configuration arguments:
deepspeed.add_config_arguments(parser)
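For illustration, here is a minimal sketch of how that line might fit into an existing argparse setup (the script-specific arguments are placeholders; if I recall correctly, add_config_arguments attaches DeepSpeed's own flags such as --deepspeed and --deepspeed_config to the parser):

```python
import argparse

import deepspeed

# Script-specific arguments -- placeholders, adjust to your own training script.
parser = argparse.ArgumentParser(description="My training script")
parser.add_argument("--epochs", type=int, default=10)
parser.add_argument("--local_rank", type=int, default=-1,
                    help="local rank passed in by the distributed launcher")

# Attach DeepSpeed's configuration arguments to the same parser.
parser = deepspeed.add_config_arguments(parser)

cmd_args = parser.parse_args()
```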
Before calling deepspeed.initialize, make sure cmd_args is defined and populated with the arguments needed to configure DeepSpeed. These might include parameters for optimization, memory optimization, gradient compression, and so on. The specific arguments to include in cmd_args depend on your training requirements and the DeepSpeed features you want to use. Here are some examples of common arguments you might include:
--optimizer: Specifies the optimizer to be used (e.g., --optimizer adam).
--memory-optimization: Controls memory optimization techniques (e.g., --memory-optimization off to disable them).
--gradient-compression: Selects a gradient compression scheme (e.g., --gradient-compression fp16).
--zero-redundancy: Controls ZeRO (zero redundancy) optimizer state partitioning (e.g., --zero-redundancy off).
--activation-checkpointing: Controls activation checkpointing for memory savings (e.g., --activation-checkpointing off).
--fp16: Controls mixed-precision training with 16-bit floating point (e.g., --fp16 off).
--amp: Controls automatic mixed precision training via PyTorch's AMP (e.g., --amp off).
--local_rank: Specifies the local rank for distributed training (e.g., --local_rank 0).
Refer to the DeepSpeed documentation, code examples, and tutorials for a comprehensive list of available arguments and their descriptions. Customize cmd_args based on your training needs and the specific DeepSpeed features you want to leverage.
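And a hedged sketch of how cmd_args then feeds into deepspeed.initialize (the model here is a stand-in, and which keyword arguments you pass depends on your setup):

```python
import argparse

import torch
import deepspeed

# Stand-in model -- substitute your own torch.nn.Module.
model = torch.nn.Linear(10, 2)

# cmd_args comes from a parser that went through deepspeed.add_config_arguments,
# as sketched above, so it carries the DeepSpeed configuration flags.
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=-1)
parser = deepspeed.add_config_arguments(parser)
cmd_args = parser.parse_args()

# initialize returns the engine plus optimizer, dataloader, and LR scheduler;
# the ones not configured come back as None.
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=cmd_args,
    model=model,
    model_parameters=model.parameters(),
)
```

When the script is started through the deepspeed launcher with a JSON config (e.g. something like deepspeed train.py --deepspeed_config ds_config.json, assuming that launch pattern), --local_rank is injected by the launcher and --deepspeed_config points at your configuration file.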
@g-karthik @ShadenSmith would either of you be able to review my previous comment and share your thoughts?
We are missing information in our getting started guide (https://github.com/microsoft/DeepSpeed#writing-deepspeed-models) about what cmd_args needs to have when executing deepspeed.initialize. Need to add some text about adding deepspeed.add_config_arguments(parser) to user code.