existme opened this issue 11 months ago
I just opened an issue on sagemaker itself because I think it's an issue with the sagemaker SDK that's limiting some versions.
Thank you for taking the time to create the issue :pray: I hope it gets the needed attention.
@LvffY, by the way, do you know any other way of deploying the model as an inference? I want to try the model on AWS, but so far, I found no way to do that.
Thanks for adding the ticket! I am also blocked by this issue.
@existme Not at this time.
I ran into the same problem.
Hugging Face has released a newer version of the image, which is accessible via SageMaker: 763104351884.dkr.ecr.eu-central-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04-v1.0
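For anyone in a different region who wants to try that image: the DLC URI follows a fixed account/region/repository:tag pattern, so you can assemble it for your own region and pass it to the endpoint explicitly. A minimal sketch, with the helper name being my own; the account ID and tag layout are taken from the URI above:

```python
# Sketch: build the Hugging Face TGI DLC image URI for a given region/version.
# 763104351884 is the public AWS Deep Learning Containers account (taken from
# the URI above); the tag layout is assumed from the same URI.
DLC_ACCOUNT = "763104351884"

def tgi_image_uri(region: str,
                  pytorch_version: str = "2.1.1",
                  tgi_version: str = "1.3.3",
                  suffix: str = "gpu-py310-cu121-ubuntu20.04-v1.0") -> str:
    repo = "huggingface-pytorch-tgi-inference"
    tag = f"{pytorch_version}-tgi{tgi_version}-{suffix}"
    return f"{DLC_ACCOUNT}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"

# The resulting URI can then be passed directly as image_uri to
# sagemaker.huggingface.HuggingFaceModel instead of relying on the
# SDK's version lookup.
print(tgi_image_uri("eu-central-1"))
```

Passing an explicit image_uri this way sidesteps the SDK's version check entirely, at the cost of having to track new DLC tags yourself.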
This is not really an issue, but I couldn't find any other way to contact you. I was trying to follow your instructions on https://www.philschmid.de/sagemaker-deploy-mixtral and ended up in this repository.
I tried to follow the deployment instructions, but the deployment was not successful. I got the following error logs on the inference endpoint:
The HF image that I ended up using was 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.1-gpu-py310-cu121-ubuntu20.04
I looked into the TGI issues and found this thread. The problem seems to be fixed by a commit mentioned there, but I don't know how to get the latest DLC image (1.3.3) for a SageMaker deployment: when I specify that version in
image_uris.retrieve
or in get_huggingface_llm_image_uri
, it complains. I don't know the procedure for getting the latest version into the AWS ECR registry, or how to use a custom-built DLC image when deploying to SageMaker. Can you help in any way, or explain how your deployment works?
Thanks in advance
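One possible workaround until the SDK's version catalogue catches up: try the SDK lookup first and fall back to a hand-built URI. This is only a sketch; resolve_image and build_tgi_uri are my own names, the fallback tag layout is assumed from the eu-central-1 image posted in this thread, and the real SDK call needs AWS configuration:

```python
# Sketch of a fallback: ask the sagemaker SDK for the image first, and if the
# requested TGI version is not in its catalogue yet (or the SDK/credentials
# are unavailable), build the URI by hand. The hand-built tag layout is an
# assumption modeled on the eu-central-1 URI posted above.
def build_tgi_uri(region: str, tgi_version: str) -> str:
    return (f"763104351884.dkr.ecr.{region}.amazonaws.com/"
            f"huggingface-pytorch-tgi-inference:"
            f"2.1.1-tgi{tgi_version}-gpu-py310-cu121-ubuntu20.04-v1.0")

def resolve_image(region: str, tgi_version: str) -> str:
    try:
        # Requires the sagemaker SDK; raises if the version is unknown to it.
        from sagemaker.huggingface import get_huggingface_llm_image_uri
        return get_huggingface_llm_image_uri("huggingface",
                                             region=region,
                                             version=tgi_version)
    except Exception:
        # SDK missing or version not listed yet: fall back to the manual URI.
        return build_tgi_uri(region, tgi_version)
```

The manual fallback is fragile (the PyTorch/CUDA part of the tag changes between releases), so it is best treated as a stopgap until the pinned version works in get_huggingface_llm_image_uri.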