NVIDIA / nim-deploy

A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment.
https://build.nvidia.com/
Apache License 2.0

Introduction

This repository showcases different ways NVIDIA NIMs can be deployed. It contains reference implementations, example documents, and architecture guides that can be used as a starting point for deploying multiple NIMs and other NVIDIA microservices into Kubernetes and other production environments.

Note The content in this repository is designed to provide reference architectures and best practices for production-grade deployments and product integrations; however, the code is not validated on all platforms and does not come with any level of enterprise support. While the deployments should perform well, please treat this codebase as experimental and a collaborative sandbox. For long-term production deployments that require enterprise support from NVIDIA, look to the official releases on NVIDIA NGC, which are based on the code in this repo.
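
As a concrete starting point, the Helm path typically looks like the sketch below. This is a minimal example, assuming the nim-llm chart shipped in this repo's helm/ directory and an NGC API key exported as NGC_API_KEY; the release name, secret names, and chart values are illustrative and may differ between chart versions.

```bash
# Minimal sketch of a Helm-based LLM NIM deployment (names are illustrative).
git clone https://github.com/NVIDIA/nim-deploy.git
cd nim-deploy/helm

# Image pull secret so Kubernetes can fetch NIM containers from nvcr.io
kubectl create secret docker-registry registry-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="$NGC_API_KEY"

# Secret the chart can use to download model artifacts from NGC
kubectl create secret generic ngc-api \
  --from-literal=NGC_API_KEY="$NGC_API_KEY"

# Install the chart; set values for the specific NIM you are deploying
helm install my-nim nim-llm/ --set model.ngcAPISecret=ngc-api
```

See the chart's own values file for the full set of configurable options, such as the model image, GPU resources, and persistence settings.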

Deployment Options

| Category | Deployment Option | Description |
|----------|-------------------|-------------|
| On-premise Deployments | Helm | LLM NIM |
| | | LLM NIM on OpenShift Container Platform (coming soon) |
| | Open Source Platforms | KServe |
| | Independent Software Vendors | Run.ai (coming soon) |
| Cloud Service Provider Deployments | Azure | AKS Managed Kubernetes |
| | | Azure ML |
| | | Azure prompt flow |
| | Amazon Web Services | EKS Managed Kubernetes |
| | | Amazon SageMaker |
| | Google Cloud Platform | GKE Managed Kubernetes |
| | | Google Cloud Vertex AI |
| | | Cloud Run |
| | NVIDIA DGX Cloud | NVIDIA Cloud Functions |
| Documents | Deployment Guide | |
| | Hugging Face NIM Deployment | |
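
Whichever option you choose, a deployed NIM exposes an OpenAI-compatible HTTP API, so a quick smoke test looks roughly like the following. The service name and model name below are assumptions; they depend on your Helm release name and on which NIM you deployed.

```bash
# Forward the NIM service locally (service name assumes the Helm release above;
# port 8000 is the NIM's OpenAI-compatible API port)
kubectl port-forward service/my-nim-nim-llm 8000:8000 &

# Send a chat completion request; the "model" field must match the deployed NIM
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'
```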

Contributions

Contributions are welcome. Developers can contribute by opening a pull request and agreeing to the terms in CONTRIBUTING.md.

Support and Getting Help

Please open an issue on the GitHub project for any questions. All feedback is appreciated, whether it is a bug report, a feature request, or a new deployment scenario.