motional / nuplan-devkit

The devkit of the nuPlan dataset.
https://www.nuplan.org

Since update_config_for_training() sets cfg.lightning.trainer.params.gpus = -1, why allow setting gpus in the config? #260

Closed HeYDwane3 closed 1 year ago

HeYDwane3 commented 1 year ago
# Imports used by this snippet (resolved from the surrounding module).
import logging
from pathlib import Path
from shutil import rmtree

import torch
from omegaconf import DictConfig, OmegaConf

logger = logging.getLogger(__name__)


def update_config_for_training(cfg: DictConfig) -> None:
    """
    Updates the config based on some conditions.
    :param cfg: omegaconf dictionary that is used to run the experiment.
    """
    # Make the configuration editable.
    OmegaConf.set_struct(cfg, False)

    if cfg.cache.cache_path is None:
        logger.warning('Parameter cache_path is not set, caching is disabled')
    else:
        if not str(cfg.cache.cache_path).startswith('s3://'):
            if cfg.cache.cleanup_cache and Path(cfg.cache.cache_path).exists():
                rmtree(cfg.cache.cache_path)

            Path(cfg.cache.cache_path).mkdir(parents=True, exist_ok=True)

    if cfg.lightning.trainer.overfitting.enable:
        cfg.data_loader.params.num_workers = 0

    if cfg.gpu and torch.cuda.is_available():
        cfg.lightning.trainer.params.gpus = -1
    else:
        cfg.lightning.trainer.params.gpus = None
        cfg.lightning.trainer.params.accelerator = None
        cfg.lightning.trainer.params.precision = 32

    # Save all interpolations and remove keys that were only used for interpolation and have no further use.
    OmegaConf.resolve(cfg)

    # Finalize the configuration and make it non-editable.
    OmegaConf.set_struct(cfg, True)

    # Log the final configuration after all overrides, interpolations and updates.
    if cfg.log_config:
        logger.info(f'Creating experiment name [{cfg.experiment}] in group [{cfg.group}] with config...')
        logger.info('\n' + OmegaConf.to_yaml(cfg))

Since update_config_for_training() automatically sets cfg.lightning.trainer.params.gpus = -1, why allow this value to be overridden at all, e.g. in default_lightning.yaml?
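To illustrate what I mean, here is a minimal sketch (not from the devkit) using only the config keys the function actually reads; the values stand in for what a user might override in default_lightning.yaml. Whatever gpus is set to, the function overwrites it with -1 whenever cfg.gpu is true and CUDA is available.

from omegaconf import OmegaConf

# Hypothetical minimal config covering only the keys the function touches;
# the real nuPlan config is composed by Hydra from many yaml files.
cfg = OmegaConf.create({
    'cache': {'cache_path': None, 'cleanup_cache': False},
    'data_loader': {'params': {'num_workers': 4}},
    'lightning': {'trainer': {
        'overfitting': {'enable': False},
        # Pretend the user asked for a single GPU via a yaml override.
        'params': {'gpus': 1, 'accelerator': 'ddp', 'precision': 16},
    }},
    'gpu': True,
    'log_config': False,
})

update_config_for_training(cfg)

# On a CUDA machine this prints -1: the user's override was discarded.
print(cfg.lightning.trainer.params.gpus)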

patk-motional commented 1 year ago

@HeYDwane3,

You are right. Generally speaking, we always use all the GPUs available to us during training. We will add a note in default_lightning.yaml that the parameter has no effect until further notice.
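For context, a rough sketch of how that value is consumed, assuming the PyTorch Lightning 1.x Trainer in use at the time: gpus=-1 selects every visible GPU, while gpus=None (the non-CUDA branch above) falls back to CPU training.

import pytorch_lightning as pl

# gpus=-1 asks Lightning 1.x to use all visible GPUs;
# gpus=None (set in the non-CUDA branch) trains on CPU instead.
trainer = pl.Trainer(gpus=-1)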