This PR adds a new method, `export_finetune`, to `Client` to support SageMaker BYO fine-tuning.
This PR also adds a Jupyter notebook that shows how to export a customer's own fine-tuned, merged weights to a TensorRT-LLM engine and deploy an endpoint from the exported engine.
This PR has been tested end to end: every cell in the Jupyter notebook was run and works as expected.