Issue #, if available:
N/A
Description of changes:
This sample notebook demonstrates how to deploy a quantized Mixtral 8x7B model on SageMaker using a g5.12xlarge instance. It uses the latest 0.26.0 LMI container and also shows how to use SageMaker Inference Recommender for load testing.
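The deployment described above is typically driven by a `serving.properties` file that tells the LMI container which model to load, how to quantize it, and how to shard it across GPUs. The sketch below assembles such a file; the property names follow LMI container conventions, but the model ID and specific values are illustrative assumptions, not taken from the notebook itself.

```python
# Hypothetical sketch of an LMI serving.properties for a quantized
# Mixtral 8x7B deployment. Values are assumptions for illustration.
config = {
    "engine": "MPI",
    # Assumed AWQ-quantized model id; the notebook may use a different one.
    "option.model_id": "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",
    "option.quantize": "awq",
    # g5.12xlarge provides 4 A10G GPUs, so shard across all of them.
    "option.tensor_parallel_degree": 4,
    "option.max_rolling_batch_size": 32,
}

def render_serving_properties(cfg: dict) -> str:
    """Render the config as key=value lines, the serving.properties format."""
    return "\n".join(f"{k}={v}" for k, v in cfg.items())

print(render_serving_properties(config))
```

The rendered text would be written to `serving.properties` and packaged alongside the model artifacts before creating the SageMaker endpoint.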
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.