Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
28.44k stars 3.39k forks

Support for AdaptDL #15119

Open pietrolesci opened 2 years ago

pietrolesci commented 2 years ago

Reporting from the idea-pool channel on Slack, as discussed with @carmocca.


Hi there,

While trying to solve an OOM problem with dynamic batch sizes based on sequence length, I just discovered AdaptDL. It might be an interesting library to support.

Some core features offered by AdaptDL are:

carmocca commented 2 years ago

Hi! Thanks for the request.

Note, however, that we would only be interested in integrating the training tricks they provide, such as the batch size and learning rate scaling features.

We already support easy and elastic scaling of cloud jobs via our Lightning.ai platform. You can find ready-to-use apps such as the PL app to launch your PL scripts, or the demo app to get started.