continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
https://docs.continue.dev/
Apache License 2.0

Connection for using a SageMaker endpoint #1130

Open theja0473 opened 7 months ago

theja0473 commented 7 months ago

Problem

Currently, only Ollama and other self-hosted services are supported. My suggestion: let's add a connection to a SageMaker endpoint so Continue can serve the Copilot use case.

Solution

No response

sestinj commented 7 months ago

Hi @theja0473, can you share more about what API format you expect for the SageMaker endpoint?

mikhail-khodorovskiy commented 5 months ago

We are also interested in hosting the model in SageMaker. The API format is:

https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/my-model-id/invocations

For boto3, here is the documentation on how to invoke the endpoint: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-runtime/client/invoke_endpoint.html
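
A minimal sketch of what that invocation could look like with boto3 (the endpoint name and region are taken from the example URL above, and the payload shape is an assumption; the actual request/response schema depends on the container serving the model):

```python
import json

import boto3

# Region and endpoint name are placeholders for a specific deployment.
client = boto3.client("sagemaker-runtime", region_name="us-west-2")

# Hypothetical text-generation payload; the schema depends on the
# serving container behind the endpoint.
payload = {"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 64}}

response = client.invoke_endpoint(
    EndpointName="my-model-id",
    ContentType="application/json",
    Body=json.dumps(payload),
)

print(json.loads(response["Body"].read()))
```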

Credentials could be provided the same way as you have implemented for Bedrock: https://docs.continue.dev/reference/Model%20Providers/bedrock
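
For reference, a sketch of how a SageMaker provider could mirror the Bedrock provider's credential handling by resolving a named profile from ~/.aws/credentials (the profile name "sagemaker" is hypothetical, not an implemented option):

```python
import boto3

# Resolve credentials from a named profile in ~/.aws/credentials,
# mirroring the Bedrock provider's setup; the profile name is an
# assumption for illustration only.
session = boto3.Session(profile_name="sagemaker")
runtime = session.client("sagemaker-runtime", region_name="us-west-2")
```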

tobiasbartsch commented 5 months ago

I am also very interested in hosting a model in SageMaker. I think the main thing to add (as @mikhail-khodorovskiy pointed out) is credentials handling identical to how it is done for Bedrock.

mikhail-khodorovskiy commented 4 months ago

We were actually able to import the model we want into Bedrock and got a model ID, but it still does not work: https://github.com/continuedev/continue/issues/1749
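
For context, imported custom models in Bedrock are invoked by their model ARN through the bedrock-runtime client; a minimal sketch, with a placeholder ARN and an assumed body schema:

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

# Placeholder ARN of an imported custom model; the body schema depends
# on the model that was imported.
response = bedrock.invoke_model(
    modelId="arn:aws:bedrock:us-west-2:111122223333:imported-model/example",
    contentType="application/json",
    body=json.dumps({"prompt": "Hello", "max_tokens": 64}),
)

print(json.loads(response["body"].read()))
```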

sestinj commented 2 months ago

Hey all! Wanted to check in on this: we've had a number of PRs made for Bedrock custom models, as well as a dedicated SageMaker provider: