defenseunicorns / leapfrogai

Production-ready Generative AI for local, cloud native, airgap, and edge deployments.
https://leapfrog.ai
Apache License 2.0

feat: jetpack-l4t based llama.cpp container #846

Open gerred opened 3 months ago

gerred commented 3 months ago

User Story

As a user, I want to run accelerated models on Jetson AGX Orins so that I can take advantage of the device's NPUs.

Additional context

Compiling using the l4t-jetpack NVIDIA container: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-jetpack

Example of a containerized build with all build tools: https://github.com/dusty-nv/jetson-containers/tree/dev
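
One practical detail when building on the l4t-jetpack image is choosing a base image tag. A rough sketch of how that could be picked automatically is below. It assumes the host exposes its L4T version in `/etc/nv_tegra_release` (as Jetson devices normally do) and that the matching image tag has the form `r<major>.<revision>`, e.g. `r35.4.1`. Both the helper names and the idea of pinning the tag to the host release are assumptions for illustration, not something this repo does today; check the NGC catalog page for the tags that actually exist.

```python
import re
from pathlib import Path

# Assumption: Jetson hosts describe their L4T release in this file.
RELEASE_FILE = Path("/etc/nv_tegra_release")

# Matches e.g. "# R35 (release), REVISION: 4.1, GCID: ..."
_PATTERN = re.compile(r"R(\d+)\s*\(release\),\s*REVISION:\s*(\d+(?:\.\d+)*)")


def l4t_tag_from_release(text):
    """Return a tag such as 'r35.4.1', or None if the text is not recognized."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    major, revision = match.groups()
    return f"r{major}.{revision}"


def image_for_host(release_file=RELEASE_FILE):
    """Hypothetical helper: suggest an l4t-jetpack base image for this host."""
    if not release_file.exists():
        return None  # Probably not a Jetson / L4T host.
    tag = l4t_tag_from_release(release_file.read_text())
    return f"nvcr.io/nvidia/l4t-jetpack:{tag}" if tag else None


if __name__ == "__main__":
    # Prints the suggested base image, or None on non-Jetson machines.
    print(image_for_host())
```

The result could be passed to the container build as a build argument, so the inferencing image is built against the same JetPack release the device is running.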

Containers don't seem to need --gpus=all, and the Jetson with JetPack comes with its own NVIDIA Container Toolkit ready to go. It looks like this would be a combination of the CPU UDS bundle with the inferencing containers built on the JetPack image.

gerred commented 3 months ago

I have a Jetson available on Tailscale for testing as needed, but I need to upgrade its storage before we can iterate.