Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
28.42k stars 3.39k forks

Shark backend integration #11884

Open dan-garvey opened 2 years ago

dan-garvey commented 2 years ago

🚀 Feature

Integrate Shark as an accelerator backend for Lightning.

Motivation

Lightning users would benefit from a "performance first" backend option for running their models. Shark users would benefit from Lightning's clean UX-focused API and clean HW abstraction. Integration seems like a win-win.

Pitch

I'd like to integrate SHARK as an accelerator, similar to how IPUs are implemented. A WIP PR should be incoming soon.
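To make the "accelerator similar to how IPUs are implemented" idea concrete, here is a minimal sketch of what a SHARK accelerator plugin could look like. All names here (`SharkAccelerator`, the method bodies) are hypothetical; the `Accelerator` base class below is a self-contained stand-in for Lightning's real one in `pytorch_lightning.accelerators`, so the sketch runs without any dependencies:

```python
from __future__ import annotations

import abc
from typing import Any, Dict


class Accelerator(abc.ABC):
    """Minimal stand-in for Lightning's Accelerator interface (not the real class)."""

    @abc.abstractmethod
    def setup_device(self, device: Any) -> None:
        """Prepare the backend for the given logical device."""

    @abc.abstractmethod
    def get_device_stats(self, device: Any) -> Dict[str, Any]:
        """Return backend-specific stats for logging/debugging."""

    @staticmethod
    @abc.abstractmethod
    def is_available() -> bool:
        """Report whether the backend's runtime can be used on this machine."""


class SharkAccelerator(Accelerator):
    """Hypothetical SHARK backend that would compile modules via torch-mlir/IREE."""

    def setup_device(self, device: Any) -> None:
        # A real implementation would initialize the IREE runtime for `device`.
        self.device = device

    def get_device_stats(self, device: Any) -> Dict[str, Any]:
        # A real implementation would query the SHARK runtime; stubbed here.
        return {"device": str(device)}

    @staticmethod
    def is_available() -> bool:
        # A real implementation would probe for an installed SHARK/IREE runtime.
        return False
```

In a real integration the Trainer would then select this backend with something like `Trainer(accelerator=SharkAccelerator())`, mirroring how the IPU accelerator is wired in, while the LightningModule itself stays unchanged.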

Additional context

Warning: not user-friendly (hence the motivation for this issue).

SHARK repo: https://github.com/NodLabs/SHARK/tree/PerformanceDevBranch
Benchmarks: https://github.com/powderluv/transformer-benchmarks
SHARK examples: https://github.com/NodLabs/shark-samples/tree/main/examples

Examples of using torch-mlir as a backend for PyTorch can be found at https://github.com/llvm/torch-mlir/blob/main/examples (a ResNet Jupyter notebook and plain Python scripts). Other models like BERT can be generated using the heavydep testing instructions at the top level of torch-mlir.

cc @borda @akihironitta @rohitgr7

ananthsub commented 2 years ago

some n00b questions:

powderluv commented 2 years ago

Yes, it is for both inference and training. We would like to integrate inference first and get that working well, then integrate training. We have been able to show good improvements over DDP and similar approaches with what we call fine-grained parallelism (now known as 3D parallelism). We are rewriting some of that to be based on IREE (previously it was based directly on MLIR).

We have our own optimizers, ZeRO-style improvements, Python-less deployment, and wrappers around NCCL to support efficient training at scale. But all of that should be invisible to the end user, à la Lightning.
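For intuition on the ZeRO-style piece mentioned above: instead of every rank replicating the full optimizer state, each parameter's state is owned by exactly one rank. The toy sketch below (not SHARK's actual scheme; the function name and greedy strategy are illustrative assumptions) shows how such a partition could be computed:

```python
from typing import Dict, List


def partition_optimizer_state(param_sizes: List[int], world_size: int) -> Dict[int, List[int]]:
    """Assign each parameter's optimizer state to exactly one rank, greedily
    balancing the number of elements each rank must store. A toy version of
    ZeRO stage-1-style partitioning, not SHARK's real algorithm."""
    load = [0] * world_size                              # elements owned per rank
    owner: Dict[int, List[int]] = {r: [] for r in range(world_size)}
    # Place the largest parameters first for better balance.
    for idx in sorted(range(len(param_sizes)), key=lambda i: -param_sizes[i]):
        rank = min(range(world_size), key=lambda r: load[r])  # least-loaded rank
        owner[rank].append(idx)
        load[rank] += param_sizes[idx]
    return owner
```

Each rank then materializes and updates only the optimizer state for its owned parameters, with collective communication (e.g. over NCCL) broadcasting the updated parameters back out; that communication is exactly the part the end user should never see.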