Lightning-AI / pytorch-lightning


Raise `MisconfigurationException` for `gpus=0 / [] / "0"` #12638

Closed: kaushikb11 closed this 1 year ago

kaushikb11 commented 2 years ago

Motivation

After #12410, we no longer fall back to `CPUAccelerator` when the user does

Trainer(accelerator="gpu", devices=0 / [] / "0")

This is inconsistent with the behavior below:

# Falls back to `CPUAccelerator`
Trainer(gpus=0 / "0" / [])

Pitch

We need consistent behavior between the `accelerator`/`devices` API and the device-specific flags such as `gpus`.

# Raises `MisconfigurationException` to be consistent with `Trainer(accelerator="gpu", devices=0 / "0" / [])`
Trainer(gpus=0 / "0" / [])
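
For illustration only (not part of the original issue), a minimal sketch of the proposed consistent behavior from the user's side, assuming a Lightning version where the check is in place and using `MisconfigurationException` from `pytorch_lightning.utilities.exceptions`:

from pytorch_lightning import Trainer
from pytorch_lightning.utilities.exceptions import MisconfigurationException

# With the proposed change, every zero-device spelling is rejected
# instead of silently falling back to the CPU accelerator.
for devices in (0, [], "0"):
    try:
        Trainer(accelerator="gpu", devices=devices)
    except MisconfigurationException as err:
        print(f"devices={devices!r} rejected: {err}")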


cc @tchaton @justusschock @awaelchli @borda @kaushikb11 @rohitgr7 @akihironitta

awaelchli commented 2 years ago

I agree with this.

The previous syntax `gpus=0` allowed the user to express "I don't want to use any GPUs", which was interpreted as "I want to stick to the CPU". However, with the new syntax, the combination `accelerator="gpu", devices=0` is contradictory. Falling back to CPU would be incorrect, as the GPU was specifically requested.

For this edge case, consistency with the `gpus` notation cannot be achieved, and this is perfectly fine.

awaelchli commented 1 year ago

Closing, as this no longer applies. We now raise an error consistently for all accelerators: https://github.com/Lightning-AI/lightning/blob/97020bf8d7a88ca5195534b8585a5ef53f1ce6cb/src/lightning/pytorch/trainer/connectors/accelerator_connector.py#L327-L336
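
For context, a rough sketch of the kind of guard the linked lines perform; the function name and error message below are illustrative, not the actual connector code:

from pytorch_lightning.utilities.exceptions import MisconfigurationException

# Illustrative only; the real check lives in
# src/lightning/pytorch/trainer/connectors/accelerator_connector.py.
def _check_device_config(devices):
    if devices in (0, "0", []):
        raise MisconfigurationException(
            f"`Trainer(devices={devices!r})` value is not a valid input."
            " Please select a number of devices greater than 0."
        )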