Open dan-garvey opened 2 years ago
some n00b questions:
Yes, it is for both inference and training. We would like to first integrate inference, get that working well, and then integrate training. We have been able to show good improvements over DDP and similar approaches with what we call fine-grained parallelism (now known as 3D parallelism). We are rewriting some of that to be based on IREE (previously it was based directly on MLIR).
We have our own optimizers, ZeRO-style improvements, Python-less deployment, and wrappers around NCCL etc. to support efficient training at scale. But all of that should be invisible to the end user, à la Lightning.
🚀 Feature
Integrate SHARK as an accelerator backend for Lightning.
Motivation
Lightning users would benefit from a "performance first" backend option for running their models. SHARK users would benefit from Lightning's clean, UX-focused API and hardware abstraction. Integration seems like a win-win.
Pitch
I'd like to integrate SHARK as an accelerator, similar to how IPU support is implemented. A WIP PR should be incoming soon.
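To make the shape of such an integration concrete, here is a minimal sketch of what a SHARK accelerator plugin could look like, mirroring the interface of Lightning's `Accelerator` base class (the real base class is `pytorch_lightning.accelerators.Accelerator`; the class name, method names, and the `iree.runtime` probe below are illustrative assumptions, not the actual PR):

```python
# Hypothetical sketch only -- not the actual SHARK/Lightning integration.
# The real plugin would subclass pytorch_lightning.accelerators.Accelerator;
# this stand-in just shows the typical interface shape.

class SharkAccelerator:
    """Hypothetical accelerator that would dispatch execution to SHARK/IREE."""

    @staticmethod
    def is_available() -> bool:
        # A real implementation would probe for an installed IREE runtime.
        try:
            import iree.runtime  # noqa: F401  (assumed package name)
            return True
        except ImportError:
            return False

    def setup_device(self, device) -> None:
        # Compile/place the model for the SHARK backend here.
        pass

    def teardown(self) -> None:
        # Release compiler and runtime resources here.
        pass
```

In recent Lightning versions, a custom accelerator like this would also be registered so that `Trainer(accelerator=...)` can discover it; the IPU integration is the closest existing template.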
Additional context
Warning: not user-friendly yet (hence the motivation for this issue).
SHARK repo: https://github.com/NodLabs/SHARK/tree/PerformanceDevBranch
Benchmarks: https://github.com/powderluv/transformer-benchmarks
SHARK examples: https://github.com/NodLabs/shark-samples/tree/main/examples
Examples of using torch-mlir as a backend for PyTorch can be found at https://github.com/llvm/torch-mlir/blob/main/examples (ResNet Jupyter notebook, plain Python). Other models like BERT can be generated using the heavydep testing instructions at the top level of torch-mlir.
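For reference, the compile path those examples follow can be sketched as below. The `torch_mlir.compile` entry point and the `"linalg-on-tensors"` output type are taken from the linked torch-mlir examples, but treat this as a hedged sketch; everything is guarded so the function degrades gracefully when torch/torchvision/torch-mlir are not installed:

```python
# Hedged sketch of the torch-mlir lowering flow from the linked examples.
# Not runnable without torch, torchvision, and torch-mlir installed, so all
# imports are guarded.

def compile_resnet_or_skip() -> str:
    try:
        import torch
        import torchvision.models as models
        import torch_mlir
    except ImportError:
        return "skipped: torch/torch-mlir not installed"

    model = models.resnet18().eval()            # untrained weights; no download
    example_input = torch.ones(1, 3, 224, 224)  # NCHW ImageNet-shaped input

    # Lower the model to the linalg-on-tensors dialect, which backends
    # such as IREE (and hence SHARK) can consume.
    module = torch_mlir.compile(model, example_input,
                                output_type="linalg-on-tensors")
    return "compiled" if module is not None else "failed"
```

The resulting MLIR module is what a SHARK accelerator would hand off to the IREE compiler and runtime.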
cc @borda @akihironitta @rohitgr7