deepjavalibrary / djl-demo

Demo applications showcasing DJL
https://demo.djl.ai
Apache License 2.0

update sample notebook for mixtral 8x7b quantized model deployment #427

Closed: melanie531 closed this pull request 3 months ago

melanie531 commented 5 months ago

Issue #, if available: N/A

Description of changes: This sample notebook demonstrates how to deploy a quantized Mixtral 8x7B model on SageMaker using a g5.12xlarge instance. It uses the latest (0.26.0) LMI container and also shows how to use SageMaker Inference Recommender for load testing.
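For context on why a quantized model is paired with this instance type, here is a rough weight-memory estimate. It is a sketch, not taken from the notebook: it assumes the publicly stated total of about 46.7B parameters for Mixtral 8x7B and the 4 x 24 GB A10G GPUs of a g5.12xlarge. The bit widths are illustrative; the description above does not say which quantization scheme the notebook uses.

```python
# Back-of-the-envelope estimate of weight memory only (illustrative).
# Assumptions, not taken from the notebook:
#   - Mixtral 8x7B has roughly 46.7e9 total parameters
#   - g5.12xlarge provides 4 NVIDIA A10G GPUs with 24 GB each
MIXTRAL_8X7B_PARAMS = 46.7e9
GPU_COUNT = 4
GPU_MEMORY_GB = 24


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


total_gpu_gb = GPU_COUNT * GPU_MEMORY_GB

# Compare a few precisions; the 8-bit and 4-bit rows are hypothetical
# quantization levels chosen for illustration.
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    needed = weight_memory_gb(MIXTRAL_8X7B_PARAMS, bits)
    share = needed / total_gpu_gb
    print(f"{label}: {needed:.1f} GB of {total_gpu_gb} GB ({share:.0%})")
```

Under these assumptions, fp16 weights alone would take up nearly all of the instance's GPU memory, leaving very little for the KV cache and activations, while quantized weights leave considerable headroom. Actual usage depends on the quantization method and the serving configuration.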

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.