ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers

Please correct this json configuration file to use with accelerate config - for 8GB VRAM cards #189


runner22k commented 1 year ago

Is your feature request related to a problem? Please describe.

As a non-programmer, I'm always frustrated when running accelerate config: I don't know which options to select to configure DeepSpeed, and it asks for a JSON file too. So I'd like a developer to look at this JSON file I got (from ChatGPT; sorry, I couldn't find a solution anywhere) and review and correct it, so that non-technical people can just give a JSON file path to accelerate config and have it work with 8GB VRAM cards.

Describe the solution you'd like

This configuration file should work with graphics cards that have just 8GB of VRAM. Please review and correct this JSON file to work with accelerate config.

Describe alternatives you've considered

I don't know of any alternatives for this.

Additional context

{
    "train_batch_size": 8,
    "train_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 2,
    "fp16": true,
    "fp16_loss_scale": 0,
    "fp16_scale_tolerance": 0.0,
    "fp16_loss_scale_window": 1000,
    "loss_scale": "dynamic",
    "optimizer": "LAMB",
    "lr": 0.0001,
    "warmup_proportion": 0.1,
    "max_steps": -1,
    "n_gpu": 1,
    "local_rank": -1,
    "seed": 42,
    "log_freq": 100,
    "gradient_clip": 0.5,
    "save_dir": "./checkpoints",
    "model_name": "bert-base-uncased",
    "data_dir": "./data",
    "output_dir": "./output",
    "do_train": true,
    "do_eval": true,
    "do_test": true,
    "deepspeed_config": {
            "stage2": {
                "offload_optimizer_to_cpu": true,
                "offload_param_to_cpu": true
            },
            "memory_efficient_fp16": true,
            "zero_optimizer": true
    }
}

ChatGPT's explanation and reasons for choosing these options and optimizations for 8GB VRAM:

In this example, the configuration file sets "gradient_accumulation_steps" to 2, so gradients are accumulated across 2 mini-batches before performing a weight update. This can reduce the memory footprint of training and allows you to use a larger effective batch size with a limited amount of VRAM.
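
For reference, DeepSpeed's documented config schema expresses this with a top-level "gradient_accumulation_steps" alongside "train_micro_batch_size_per_gpu" (the key "train_batch_size_per_gpu" in the file above is not part of that schema), and "train_batch_size" must equal the micro-batch size times the accumulation steps times the number of GPUs; note the file above is internally inconsistent on this point (8 ≠ 8 × 2 on one GPU). A minimal sketch, assuming a single GPU and a micro-batch size of 1 to fit in 8GB:

{
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 2,
    "train_batch_size": 2
}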

It also set "optimizer" to "LAMB", this optimizer is memory-efficient as well as zero_optimizer option is set to "True", this option reduces the memory usage of optimizer by zeroing optimizer's gradients before backward pass.

It also set "fp16" to "True" and "memory_efficient_fp16" to "True", this option enables FP16 precision training with memory-efficient techniques, it reduces the memory usage of the model.

It also set "offload_param_to_cpu" and "offload_optimizer_to_cpu" options to "True" this option offloads the model's parameters and optimizer state to CPU memory, which can help to reduce the memory footprint of the model.

Please note that this is an example of how you can configure the options for training a model with DeepSpeed on limited VRAM, and it can be adjusted to suit your specific use case and the resources you have.
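
Putting those pieces together, here is a minimal sketch of what a schema-valid DeepSpeed config file for this scenario might look like. It is an untested assumption for a single 8GB GPU, not a verified recipe; you would pass the path of this file when accelerate config asks for a DeepSpeed JSON:

{
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 2,
    "train_batch_size": 2,
    "gradient_clipping": 1.0,
    "fp16": {
        "enabled": true,
        "loss_scale": 0,
        "loss_scale_window": 1000
    },
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": true
        },
        "overlap_comm": true,
        "contiguous_gradients": true
    }
}

Keys such as "save_dir", "model_name", "data_dir", and the "do_train"/"do_eval"/"do_test" flags from the original file are not part of DeepSpeed's config schema at all; they belong to whatever training script consumes them, so they are omitted here.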