Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

Refactor trainer._log_device_info() method and warnings #11014

Open · four4fish opened 2 years ago

four4fish commented 2 years ago

Proposed refactor

Raised in discussion by @ananthsub and @justusschock in #11001 (1/n Generalize internal checks for Accelerator in Trainer - remove trainer._device_type).

Motivation

_log_device_info() in the Trainer is too verbose and its messages are not helpful: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/trainer.py#L1630-L1661

Accelerator/device selection only happens in the accelerator_connector, so the related warnings and logging should happen in the accelerator_connector as well.

The warnings and their logic can be merged into select_accelerator_type().

Pitch

Simplify the warnings in trainer._log_device_info() and make them less verbose: remove unnecessary warnings and reduce the log level from warning to debug (a rough sketch follows below).

Move _log_device_info() to the accelerator_connector and call it at the end of __init__(), or merge the logic into the accelerator_connector.
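
For illustration, a minimal sketch of what the slimmed-down logging could look like. The placement next to the selection logic and the attribute names (use_gpu, use_tpu) are assumptions; rank_zero_warn is the existing utility, and a plain module logger is used for the debug-level lines:

```python
# Hypothetical sketch only: device logging kept to one concise debug line per
# device type, with an actual warning only for the actionable case. Meant as
# a method on the connector; `self.use_gpu` / `self.use_tpu` are illustrative.
import logging

import torch
from pytorch_lightning.utilities import rank_zero_warn

log = logging.getLogger(__name__)


def _log_device_info(self) -> None:
    log.debug(f"GPU available: {torch.cuda.is_available()}, used: {self.use_gpu}")
    log.debug(f"TPU used: {self.use_tpu}")

    # Keep a real warning only when hardware is present but not selected.
    if torch.cuda.is_available() and not self.use_gpu:
        rank_zero_warn("GPU available but not used. Set `gpus=k` in the Trainer to use it.")
```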

Additional context


If you enjoy Lightning, check out our other projects! ⚡

cc @justusschock @awaelchli @akihironitta @carmocca @edward-io @ananthsub @kaushikb11 @ninginthecloud

kaushikb11 commented 2 years ago

related warnings and logging should happen in the accelerator_connector as well.

@four4fish It was initially part of the AcceleratorConnector, but we came across a use case where _log_device_info needed to be overridden with custom logic. Hence, we introduced it as a method on the Trainer for this purpose.

four4fish commented 2 years ago

@kaushikb11 thank you for sharing the background! Could you share more detail about the use cases? Do you mean users want to override _log_device_info()? There was no linked issue, and I couldn't find any details in the PR. Is this still required?

kaushikb11 commented 2 years ago

Sure! We could have frameworks building on top of the Lightning Trainer. For instance, one framework introduced a new IPEX accelerator and added modifications for it; overriding _log_device_info would let them customize the device info logging as well.
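
As a concrete sketch of that pattern (names like IPEXTrainer and use_ipex are purely illustrative, not real Lightning or IPEX APIs):

```python
# Hypothetical: a framework subclasses Trainer and extends _log_device_info
# to also report its custom (e.g. IPEX) accelerator.
from pytorch_lightning import Trainer
from pytorch_lightning.utilities import rank_zero_info


class IPEXTrainer(Trainer):
    def __init__(self, use_ipex: bool = False, **kwargs):
        # Set before super().__init__(), since Trainer.__init__ calls
        # _log_device_info() during setup.
        self.use_ipex = use_ipex
        super().__init__(**kwargs)

    def _log_device_info(self) -> None:
        super()._log_device_info()  # keep the stock GPU/TPU/IPU messages
        rank_zero_info(f"IPEX available: {self.use_ipex}")
```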

four4fish commented 2 years ago

Interesting use case! But it seems the user extends the Trainer class with their own Accelerator and plugin selection logic, and the accelerator, plugins, and args are then passed into super().__init__(), which calls our accelerator_connector. In that case, couldn't the device logging live in either place? Am I missing something? A rough sketch of the extension pattern I mean (purely illustrative):
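
```python
# Illustrative only: the framework builds its custom accelerator and forwards
# it to Trainer.__init__, which is what constructs and runs the
# AcceleratorConnector — so the device logging could happen there as well.
from pytorch_lightning import Trainer


def make_custom_accelerator():
    # Hypothetical framework helper: would build a custom Accelerator
    # (e.g. for IPEX). Returning None here just falls back to auto-selection.
    return None


class CustomTrainer(Trainer):
    def __init__(self, **kwargs):
        super().__init__(accelerator=make_custom_accelerator(), **kwargs)
```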