mattiaciollaro opened this issue 10 months ago
@krrishdholakia do you think this request is feasible?
Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker
What am I missing?
@mattiaciollaro is this PR still relevant?
Sorry for the delay, guys.
> Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker
https://docs.litellm.ai/docs/providers/aws_sagemaker seems to support SageMaker JumpStart models specifically.
I am thinking of a different situation, where a model is already deployed via SageMaker and all we have is the name of its inference endpoint (as in here). In that case, how can we instruct pr-agent to use the LLM behind that pre-existing endpoint? I don't see a way of doing this via https://docs.litellm.ai/docs/providers/aws_sagemaker
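For reference, the usage those docs show is along these lines (the AWS credentials and the JumpStart endpoint name below are placeholders taken from the docs' example):

```python
import os
from litellm import completion

# Placeholder credentials, as in the LiteLLM docs
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

# The model string is a deployed JumpStart endpoint, prefixed with "sagemaker/"
response = completion(
    model="sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    temperature=0.2,
    max_tokens=80,
)
```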
In the context of a POC with my team, the way we accomplished this was to hack pr-agent's default AI handler (which is the LiteLLM AI handler) and use the sagemaker SDK (specifically, the HuggingFace predictor) to make requests to the pre-existing SageMaker endpoint.
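Roughly, the hack looked like this (the endpoint name is a placeholder, and the payload shape assumes a Hugging Face TGI-style serving container):

```python
from sagemaker.huggingface import HuggingFacePredictor

# Point the predictor at the pre-existing SageMaker inference endpoint
predictor = HuggingFacePredictor(endpoint_name="my-llama-endpoint")  # placeholder name

# The payload format depends on the serving container; this is the
# typical shape for Hugging Face text-generation endpoints
response = predictor.predict({
    "inputs": "Summarize the following diff: ...",
    "parameters": {"temperature": 0.2, "max_new_tokens": 512},
})
print(response)
```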
I imagine a cleaner solution would be to implement a dedicated AI handler for this use case?
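Something along these lines, perhaps (a very rough sketch: it assumes the base class lives at `pr_agent.algo.ai_handlers.base_ai_handler` and exposes an async `chat_completion(model, system, user, temperature)` method plus a `deployment_id` property, and that the endpoint is served by a TGI-style container; all of that would need to be checked against the actual codebase):

```python
from sagemaker.huggingface import HuggingFacePredictor
from pr_agent.algo.ai_handlers.base_ai_handler import BaseAiHandler  # assumed import path


class SageMakerEndpointAIHandler(BaseAiHandler):
    """AI handler that forwards requests to a pre-existing SageMaker endpoint."""

    def __init__(self, endpoint_name: str):
        self.predictor = HuggingFacePredictor(endpoint_name=endpoint_name)

    @property
    def deployment_id(self):
        # Not applicable to SageMaker endpoints
        return None

    async def chat_completion(self, model: str, system: str, user: str, temperature: float = 0.2):
        payload = {
            "inputs": f"{system}\n\n{user}",
            "parameters": {"temperature": temperature, "max_new_tokens": 2048},
        }
        # Note: predict() is a blocking call; a real implementation would
        # likely run it in a thread pool from this async method
        response = self.predictor.predict(payload)
        # TGI-style containers return [{"generated_text": "..."}];
        # other containers may need different parsing
        return response[0]["generated_text"], "stop"
```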
> @mattiaciollaro is this PR still relevant?
I don't have a PR out for this, but yes: I think the feature request is still relevant :) My apologies again for the delay!
Context: assume a user has a pre-configured LLM inference endpoint in SageMaker (for example, a self-hosted Llama model, as described here). It would be nice to let the user configure pr-agent to leverage that endpoint, e.g. by means of a dedicated AI handler.
Discord chat: https://discord.com/channels/1057273017547378788/1057273018084237344/1197261978591309884
cc: @krrishdholakia