Open rafal-dudek opened 6 months ago
Expected Behavior
I would like to be able to use my Custom Model (e.g. a fine-tuned foundation model) deployed on a GCP Endpoint with the Spring AI library.
API for Endpoints: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints

Example predict request:

```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/49...78/locations/us-central1/endpoints/433...736:predict
{"instances":[{"content":"Hello"}],"parameters":{"temperature":0.2,"maxOutputTokens":1024,"topP":0.8,"topK":40,"candidateCount":1}}
```
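For illustration, the predict call above can be assembled with the JDK's `java.net.http` client. This is a minimal sketch, not Spring AI code: the project ID, location, and endpoint ID are placeholders, and an actual call would additionally need an `Authorization: Bearer <token>` header obtained from Google credentials.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of the raw Vertex AI :predict request for a custom endpoint.
// PROJECT_ID and ENDPOINT_ID below are hypothetical placeholders.
public class EndpointPredict {
    static final String PROJECT_ID = "my-project";   // placeholder
    static final String LOCATION = "us-central1";
    static final String ENDPOINT_ID = "1234567890";  // placeholder

    // Assemble the JSON body from the issue: one instance plus generation parameters.
    static String buildPredictBody(String content) {
        return "{\"instances\":[{\"content\":\"" + content + "\"}],"
             + "\"parameters\":{\"temperature\":0.2,\"maxOutputTokens\":1024,"
             + "\"topP\":0.8,\"topK\":40,\"candidateCount\":1}}";
    }

    // Build (but do not send) the HTTP request; a real call also needs an
    // OAuth2 bearer token header from Google application-default credentials.
    static HttpRequest buildRequest(String content) {
        String url = String.format(
            "https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/endpoints/%s:predict",
            LOCATION, PROJECT_ID, LOCATION, ENDPOINT_ID);
        return HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(buildPredictBody(content)))
            .build();
    }

    public static void main(String[] args) {
        System.out.println(buildRequest("Hello").uri());
        System.out.println(buildPredictBody("Hello"));
    }
}
```

The point of the request is that the feature would need Spring AI to target an arbitrary endpoint ID rather than a fixed foundation-model path.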
Current Behavior
Currently, VertexAI PaLM2 Chat and VertexAI Gemini Chat support only foundation models: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-chat
Context
There is a similar issue for langchain4j: https://github.com/langchain4j/langchain4j/issues/440
There has been some work in this area to support models deployed on GKE. Will schedule this for M2 in case it comes together in time.
Pointer to documentation: https://cloud.google.com/blog/products/application-development/choosing-a-self-hosted-or-managed-solution-for-ai-app-development