huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

Command line arguments related to deepspeed for `accelerate launch` do not override those of `default_config.yaml` #3203

Open JdbermeoUZH opened 3 weeks ago

JdbermeoUZH commented 3 weeks ago

System Info

- `Accelerate` version: 0.27.0
- Platform: Linux-4.19.0-27-amd64-x86_64-with-glibc2.28
- Python version: 3.10.13
- Numpy version: 1.26.3
- PyTorch version (GPU?): 2.1.0 (False)
- PyTorch XPU available: False
- PyTorch NPU available: False
- System RAM: 31.33 GB
- `Accelerate` default config:
        - compute_environment: LOCAL_MACHINE
        - distributed_type: DEEPSPEED
        - mixed_precision: fp16
        - use_cpu: False
        - debug: True
        - num_processes: 4
        - machine_rank: 0
        - num_machines: 1
        - rdzv_backend: static
        - same_network: True
        - main_training_function: main
        - deepspeed_config: {'gradient_accumulation_steps': 4, 'gradient_clipping': 0.5, 'zero3_init_flag': False, 'zero_stage': 0}
        - downcast_bf16: no
        - tpu_use_cluster: False
        - tpu_use_sudo: False
        - tpu_env: []
        - dynamo_config: {'dynamo_backend': 'NVPRIMS_NVFUSER'}

Reproduction

  1. Run `accelerate config` and choose the DeepSpeed parameters (e.g. `gradient_accumulation_steps: 4`) so they are written to `default_config.yaml`.
  2. Launch with values that differ from the config, e.g. `accelerate launch --gradient_accumulation_steps <new_value> --num_processes <new_value> train.py` (a minimal `train.py` sketch is shown after these steps).
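
A minimal sketch of the (hypothetical) `train.py` used above, which prints the values the `Accelerator` actually ends up with; reading them through `accelerator.num_processes` and `accelerator.state.deepspeed_plugin.deepspeed_config` is an assumption about where best to inspect them:

```python
# train.py -- minimal sketch to show which launch arguments actually take effect
from accelerate import Accelerator


def main():
    accelerator = Accelerator()

    # Number of processes the launcher actually started
    accelerator.print(f"num_processes: {accelerator.num_processes}")

    # Gradient accumulation steps as seen by the DeepSpeed plugin
    ds_plugin = accelerator.state.deepspeed_plugin
    if ds_plugin is not None:
        accelerator.print(
            "gradient_accumulation_steps: "
            f"{ds_plugin.deepspeed_config.get('gradient_accumulation_steps')}"
        )


if __name__ == "__main__":
    main()
```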

Expected behavior

Both `num_processes` and `gradient_accumulation_steps` should be overridden by the arguments passed to `accelerate launch`. Currently only `num_processes` is overridden; `gradient_accumulation_steps` keeps the value from `default_config.yaml` (4 in the config above).
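
For reference, a sketch of sidestepping the flag by constructing the `DeepSpeedPlugin` explicitly in the script (still launched with `accelerate launch`); `DeepSpeedPlugin` and the `deepspeed_plugin` constructor argument are part of the public API, the concrete values are placeholders:

```python
# Sketch: set the value in code instead of via the CLI flag.
# Explicitly passed values should take precedence over what the
# launcher read from default_config.yaml.
from accelerate import Accelerator, DeepSpeedPlugin

ds_plugin = DeepSpeedPlugin(gradient_accumulation_steps=8, zero_stage=0)  # placeholder values
accelerator = Accelerator(deepspeed_plugin=ds_plugin)

accelerator.print(
    "gradient_accumulation_steps: "
    f"{accelerator.state.deepspeed_plugin.deepspeed_config['gradient_accumulation_steps']}"
)
```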