awslabs / llm-hosting-container

Large Language Model Hosting Container
Apache License 2.0

How can I deploy a model from AWS S3, without downloading it from Hugging Face, using the TGI image on SageMaker? #27

Open weiZhenkun opened 1 year ago

ramkrithik commented 11 months ago

Yes. You can pass the S3 location as model_data when initialising HuggingFaceModel and set "HF_MODEL_ID" to "/opt/ml/model". Before that, you should convert the model weights to safetensors by loading the model and re-saving it with safe serialisation (there may be better solutions than this).

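from sagemaker.huggingface import HuggingFaceModel

hub = {
    # Point TGI at the directory where SageMaker extracts model_data
    'HF_MODEL_ID': '/opt/ml/model',
    # ... other environment variables
}
huggingface_model = HuggingFaceModel(
    model_data="s3://x/model.tar.gz",
    env=hub,
    # role and image_uri (the TGI image) are typically also passed here
)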
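
For the "before that" step, here is a rough sketch of how the re-save and packaging could look. It is not something from the repo. The helper names resave_as_safetensors and package_model_dir are made up for illustration, and it assumes the model loads with AutoModelForCausalLM and that the files should sit at the root of the archive so that config.json ends up directly in /opt/ml/model.

```python
import tarfile
from pathlib import Path


def resave_as_safetensors(src_dir: str, dst_dir: str) -> None:
    """Load a model from src_dir and re-save it with safetensors weights.

    Hypothetical helper; assumes a causal LM that transformers can load.
    """
    # Imported lazily so the packaging helper below works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(src_dir)
    tokenizer = AutoTokenizer.from_pretrained(src_dir)
    # safe_serialization=True writes *.safetensors instead of pickle-based *.bin
    model.save_pretrained(dst_dir, safe_serialization=True)
    tokenizer.save_pretrained(dst_dir)


def package_model_dir(model_dir: str, out_path: str = "model.tar.gz") -> str:
    """Create a gzip tarball with the model files at the archive root.

    Hypothetical helper; out_path should live outside model_dir so the
    archive does not try to include itself.
    """
    root = Path(model_dir)
    with tarfile.open(out_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname is relative to model_dir, so files are not nested
                # under an extra top-level folder inside the archive.
                tar.add(path, arcname=path.relative_to(root).as_posix())
    return out_path
```

You would then upload the resulting model.tar.gz to your bucket and use that S3 URI as model_data.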