Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

LightningCLI: --help argument given after the subcommand fails #20199

Open nisar2 opened 2 months ago

nisar2 commented 2 months ago

Bug description

I'm trying to use LightningCLI to configure my code from the command line, but LightningCLI seems to have trouble parsing the default logger of the trainer when I run the following:

python test.py fit -h

However, just running the command without the -h flag works.

What version are you seeing the problem on?

v2.4

How to reproduce the bug

import torch
from torch.utils.data import DataLoader, Dataset

from pytorch_lightning import LightningModule
from pytorch_lightning.cli import LightningCLI

class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

class RandomModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)

    def train_dataloader(self):
        return DataLoader(RandomDataset(32, 64))

def main():
    cli = LightningCLI(RandomModel)

if __name__ == "__main__":
    main()

Error messages and logs

ValueError: Not possible to determine the import path for object typing.Iterable[pytorch_lightning.loggers.logger.Logger].

Environment

Current environment

* CUDA:
    - GPU:
        - NVIDIA GeForce RTX 4090
    - available: True
    - version: 12.1
* Lightning:
    - lightning: 2.4.0
    - lightning-utilities: 0.11.6
    - pytorch-lightning: 2.4.0
    - torch: 2.4.0
    - torchaudio: 2.4.0
    - torchmetrics: 1.4.0.post0
    - torchvision: 0.19.0
* Packages:
    - accelerate: 0.21.0
    - asttokens: 2.0.5
    - autocommand: 2.2.2
    - backcall: 0.2.0
    - backports.tarfile: 1.2.0
    - bottleneck: 1.3.7
    - brotli: 1.0.9
    - certifi: 2024.7.4
    - charset-normalizer: 3.3.2
    - colorama: 0.4.6
    - comm: 0.2.2
    - contourpy: 1.2.0
    - cycler: 0.11.0
    - debugpy: 1.6.7
    - decorator: 5.1.1
    - diffusers: 0.18.2
    - docstring-parser: 0.16
    - entrypoints: 0.4
    - exceptiongroup: 1.2.0
    - executing: 0.8.3
    - filelock: 3.13.1
    - fonttools: 4.51.0
    - fsspec: 2024.6.1
    - gmpy2: 2.1.2
    - huggingface-hub: 0.23.1
    - idna: 3.7
    - importlib-metadata: 7.0.1
    - importlib-resources: 6.4.0
    - inflect: 7.3.1
    - ipykernel: 6.29.5
    - ipython: 8.15.0
    - jaraco.context: 5.3.0
    - jaraco.functools: 4.0.1
    - jaraco.text: 3.12.1
    - jedi: 0.19.1
    - jinja2: 3.1.4
    - joblib: 1.4.2
    - jsonargparse: 4.32.0
    - jupyter-client: 7.4.9
    - jupyter-core: 5.7.2
    - kiwisolver: 1.4.4
    - lightning: 2.4.0
    - lightning-utilities: 0.11.6
    - markupsafe: 2.1.3
    - matplotlib: 3.8.4
    - matplotlib-inline: 0.1.6
    - mkl-fft: 1.3.8
    - mkl-random: 1.2.4
    - mkl-service: 2.4.0
    - more-itertools: 10.3.0
    - mpmath: 1.3.0
    - nest-asyncio: 1.6.0
    - networkx: 3.2.1
    - numexpr: 2.8.7
    - numpy: 1.26.4
    - ordered-set: 4.1.0
    - packaging: 24.1
    - pandas: 2.2.2
    - parso: 0.8.3
    - pickleshare: 0.7.5
    - pillow: 10.4.0
    - pip: 24.2
    - platformdirs: 4.2.2
    - ply: 3.11
    - prompt-toolkit: 3.0.43
    - psutil: 5.9.0
    - pure-eval: 0.2.2
    - pygments: 2.15.1
    - pyparsing: 3.0.9
    - pyqt5: 5.15.10
    - pyqt5-sip: 12.13.0
    - pysocks: 1.7.1
    - python-dateutil: 2.9.0.post0
    - pytorch-lightning: 2.4.0
    - pytz: 2024.1
    - pywin32: 305.1
    - pyyaml: 6.0.1
    - pyzmq: 24.0.1
    - regex: 2024.7.24
    - requests: 2.32.3
    - scikit-learn: 1.5.1
    - scipy: 1.13.1
    - setuptools: 72.1.0
    - sip: 6.7.12
    - six: 1.16.0
    - stack-data: 0.2.0
    - sympy: 1.12
    - threadpoolctl: 3.5.0
    - tomli: 2.0.1
    - torch: 2.4.0
    - torchaudio: 2.4.0
    - torchmetrics: 1.4.0.post0
    - torchvision: 0.19.0
    - tornado: 6.4.1
    - tqdm: 4.66.5
    - traitlets: 5.14.3
    - typeguard: 4.3.0
    - typeshed-client: 2.7.0
    - typing-extensions: 4.11.0
    - tzdata: 2023.3
    - unicodedata2: 15.1.0
    - urllib3: 2.2.2
    - wcwidth: 0.2.5
    - wheel: 0.43.0
    - win-inet-pton: 1.1.0
    - zipp: 3.17.0
* System:
    - OS: Windows
    - architecture:
        - 64bit
        - WindowsPE
    - processor: Intel64 Family 6 Model 183 Stepping 1, GenuineIntel
    - python: 3.9.19
    - release: 10
    - version: 10.0.22631

More info

I'm wondering if this is a bug, or just something I'm doing wrong with my setup?

Also, without the -h flag, the code runs fine. However, my config file has the following output for the trainer:

trainer:
  accelerator: auto
  strategy: auto
  devices: auto
  num_nodes: 1
  precision: null
  logger: null
  callbacks: null
  fast_dev_run: false
  max_epochs: null
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: null
  limit_val_batches: null
  limit_test_batches: null
  limit_predict_batches: null
  overfit_batches: 0.0
  val_check_interval: null
  check_val_every_n_epoch: 1
  num_sanity_val_steps: null
  log_every_n_steps: null
  enable_checkpointing: null
  enable_progress_bar: null
  enable_model_summary: null
  accumulate_grad_batches: 1
  gradient_clip_val: null
  gradient_clip_algorithm: null
  deterministic: null
  benchmark: null
  inference_mode: true
  use_distributed_sampler: true
  profiler: null
  detect_anomaly: false
  barebones: false
  plugins: null
  sync_batchnorm: false
  reload_dataloaders_every_n_epochs: 0
  default_root_dir: null

Should there really be so many null values?

mauvilsa commented 2 months ago

> Should there really be so many null values?

Among the purposes of LightningCLI are reproducibility and reporting. The idea is that the saved config includes all settings needed to meet these goals. There are many null values because many parameters default to None.
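As a rough illustration of why None defaults show up as null, the sketch below dumps the defaults of a small stand-in class. ToyTrainer and default_config are hypothetical names made up for this example, not Lightning or jsonargparse APIs, and JSON is used instead of YAML only to keep the snippet standard-library-only.

```python
import inspect
import json


class ToyTrainer:
    """Hypothetical stand-in for a Trainer-like class (not Lightning's real signature)."""

    def __init__(self, accelerator="auto", precision=None, logger=None, max_steps=-1):
        self.accelerator = accelerator
        self.precision = precision
        self.logger = logger
        self.max_steps = max_steps


def default_config(cls):
    """Collect every __init__ parameter and its default value, skipping `self`.

    Assumes every parameter has a default, as in ToyTrainer above.
    """
    params = inspect.signature(cls.__init__).parameters
    return {name: p.default for name, p in params.items() if name != "self"}


config = default_config(ToyTrainer)
# Python's None is serialized as null, so every None default becomes a null entry.
print(json.dumps(config, indent=2))
```

Any parameter whose default is None ends up as null in the dumped config, which matches the pattern in the trainer section posted above.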

With the code snippet above, the issue doesn't happen for me. That error comes from jsonargparse. To get more information, set the environment variable with export JSONARGPARSE_DEBUG=true (I'm not sure what the equivalent is on Windows) and then run the script. If you post the output here, I might be able to tell more.
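Since the shell syntax for setting an environment variable differs between platforms, one option is to set it from Python and launch the script as a subprocess. This is only a sketch: run_with_jsonargparse_debug is a hypothetical helper name, and it assumes the script (test.py in this report) is in the current directory.

```python
import os
import subprocess
import sys


def run_with_jsonargparse_debug(script_args):
    """Run `python <script_args...>` with JSONARGPARSE_DEBUG=true in its environment."""
    env = dict(os.environ)              # copy, so the parent environment is untouched
    env["JSONARGPARSE_DEBUG"] = "true"
    return subprocess.run(
        [sys.executable, *script_args],
        env=env,
        capture_output=True,
        text=True,
    )


# Example usage for the script in this report:
# result = run_with_jsonargparse_debug(["test.py", "fit", "-h"])
# print(result.stdout)
# print(result.stderr)
```

The same interpreter that runs the helper is reused via sys.executable, so the subprocess sees the same installed packages.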

apple2373 commented 1 month ago

I got the same error. I'm using Ubuntu with Python 3.9 and Lightning 2.4. The exact environment is attached as a YAML file: env.yaml.txt

While python test.py fit works normally, python test.py fit -h gives the following error.

$ python test.py fit -h
2024-09-14 01:10:19,565 - LightningArgumentParser - DEBUG - Skipping parameter "model" from "pytorch_lightning.Trainer.fit" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,566 - LightningArgumentParser - DEBUG - Skipping parameter "train_dataloaders" from "pytorch_lightning.Trainer.fit" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,566 - LightningArgumentParser - DEBUG - Skipping parameter "val_dataloaders" from "pytorch_lightning.Trainer.fit" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,566 - LightningArgumentParser - DEBUG - Skipping parameter "datamodule" from "pytorch_lightning.Trainer.fit" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,586 - LightningArgumentParser - DEBUG - Skipping parameter "model" from "pytorch_lightning.Trainer.validate" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,586 - LightningArgumentParser - DEBUG - Skipping parameter "dataloaders" from "pytorch_lightning.Trainer.validate" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,587 - LightningArgumentParser - DEBUG - Skipping parameter "datamodule" from "pytorch_lightning.Trainer.validate" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,611 - LightningArgumentParser - DEBUG - Skipping parameter "model" from "pytorch_lightning.Trainer.test" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,611 - LightningArgumentParser - DEBUG - Skipping parameter "dataloaders" from "pytorch_lightning.Trainer.test" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,611 - LightningArgumentParser - DEBUG - Skipping parameter "datamodule" from "pytorch_lightning.Trainer.test" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,639 - LightningArgumentParser - DEBUG - Skipping parameter "model" from "pytorch_lightning.Trainer.predict" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,639 - LightningArgumentParser - DEBUG - Skipping parameter "dataloaders" from "pytorch_lightning.Trainer.predict" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,639 - LightningArgumentParser - DEBUG - Skipping parameter "datamodule" from "pytorch_lightning.Trainer.predict" because of: Parameter requested to be skipped.
2024-09-14 01:10:19,640 - LightningArgumentParser - DEBUG - Loaded parser defaults: Namespace(config=None, subcommand=None)
2024-09-14 01:10:19,745 - LightningArgumentParser - DEBUG - Loaded parser defaults: Namespace(config=None, seed_everything=True, trainer=Namespace(accelerator='auto', strategy='auto', devices='auto', num_nodes=1, precision=None, logger=None, callbacks=None, fast_dev_run=False, max_epochs=None, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, limit_train_batches=None, limit_val_batches=None, limit_test_batches=None, limit_predict_batches=None, overfit_batches=0.0, val_check_interval=None, check_val_every_n_epoch=1, num_sanity_val_steps=None, log_every_n_steps=None, enable_checkpointing=None, enable_progress_bar=None, enable_model_summary=None, accumulate_grad_batches=1, gradient_clip_val=None, gradient_clip_algorithm=None, deterministic=None, benchmark=None, inference_mode=True, use_distributed_sampler=True, profiler=None, detect_anomaly=False, barebones=False, plugins=None, sync_batchnorm=False, reload_dataloaders_every_n_epochs=0, default_root_dir=None), ckpt_path=None)
Traceback (most recent call last):
  File "/home/ssd_satoshi/projects/hand/test.py", line 43, in <module>
    main()
  File "/home/ssd_satoshi/projects/hand/test.py", line 40, in main
    cli = LightningCLI(RandomModel)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/pytorch_lightning/cli.py", line 383, in __init__
    self.parse_arguments(self.parser, args)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/pytorch_lightning/cli.py", line 534, in parse_arguments
    self.config = parser.parse_args(args)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_deprecated.py", line 123, in patched_parse
    cfg = parse_method(*args, _skip_check=_skip_check, **kwargs)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_core.py", line 396, in parse_args
    cfg, unk = self.parse_known_args(args=args, namespace=cfg)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_core.py", line 264, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2049, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2026, in consume_positionals
    take_action(action, args)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_actions.py", line 651, in __call__
    namespace[subcommand] = subparser.parse_args(arg_strings, namespace=subnamespace, **kwargs)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_deprecated.py", line 123, in patched_parse
    cfg = parse_method(*args, _skip_check=_skip_check, **kwargs)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_core.py", line 396, in parse_args
    cfg, unk = self.parse_known_args(args=args, namespace=cfg)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_core.py", line 264, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2067, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2007, in consume_optional
    take_action(action, args, option_string)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 1099, in __call__
    parser.print_help()
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2555, in print_help
    self._print_message(self.format_help(), file)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_core.py", line 1254, in format_help
    help_str = super().format_help()
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 2539, in format_help
    return formatter.format_help()
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 283, in format_help
    help = self._root_section.format_help()
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 214, in format_help
    item_help = join([func(*args) for func, args in self.items])
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 214, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 214, in format_help
    item_help = join([func(*args) for func, args in self.items])
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 214, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/argparse.py", line 533, in _format_action
    help_text = self._expand_help(action)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_formatters.py", line 138, in _expand_help
    help_str = PercentTemplate(self._get_help_string(action)).safe_substitute(params)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_formatters.py", line 85, in _get_help_string
    help_str += action.extra_help()
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_typehints.py", line 641, in extra_help
    class_paths = get_all_subclass_paths(self._typehint)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_typehints.py", line 1214, in get_all_subclass_paths
    add_subclasses(arg)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_typehints.py", line 1193, in add_subclasses
    class_path = get_import_path(cl)
  File "/home/localstorage/miniconda3/envs/hand/lib/python3.9/site-packages/jsonargparse/_util.py", line 238, in get_import_path
    raise ValueError(f"Not possible to determine the import path for object {value}.")
ValueError: Not possible to determine the import path for object typing.Iterable[pytorch_lightning.loggers.logger.Logger].

Note that I enabled debug logging with export JSONARGPARSE_DEBUG=true.
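The object named in the error, typing.Iterable[...Logger], is a subscripted generic rather than a class. The sketch below is not jsonargparse's code. It only illustrates, with a hypothetical stand-in Logger class, how such an alias differs from a class and how its element type can be recovered.

```python
import inspect
import typing
from collections.abc import Iterable as AbcIterable


class Logger:
    """Hypothetical stand-in for pytorch_lightning.loggers.logger.Logger."""


hint = typing.Iterable[Logger]

# A plain class is a class; a subscripted generic is an alias object instead.
print(inspect.isclass(Logger))
print(inspect.isclass(hint))

# The alias can be taken apart into its origin and its type arguments.
origin = typing.get_origin(hint)
args = typing.get_args(hint)
print(origin is AbcIterable, args == (Logger,))
```

Code that expects a class (for example, to list its subclasses) would need to unwrap the alias first. Whether that is exactly what went wrong inside jsonargparse is an assumption here, not something the traceback confirms.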

apple2373 commented 1 month ago

I solved it by downgrading jsonargparse to 4.29.0. Running pip install jsonargparse==4.29.0 does the job.
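To confirm which jsonargparse version the interpreter actually picks up after the downgrade, a small check like the following may help. installed_version is a hypothetical helper name; in this thread, 4.32.0 is the version reported as failing and 4.29.0 the one reported as working.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("jsonargparse:", installed_version("jsonargparse"))
```

Run it with the same Python environment used for test.py, since different environments can hold different versions.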

mauvilsa commented 1 month ago

This was fixed in pull request https://github.com/omni-us/jsonargparse/pull/578, and will be part of the next jsonargparse release.