spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0

Cannot invoke Vertex AI, getting a 'not implemented' HTML page #1349

Closed bogdansolga closed 3 weeks ago

bogdansolga commented 2 months ago

Bug description

When I invoke the Google Gemini LLM, using either the Spring AI wrappers or the native Vertex API (described here), I get a weird 'not implemented' HTML error:

An error has occurred - 'io.grpc.StatusRuntimeException: UNIMPLEMENTED: HTTP status code 404

invalid content-type: text/html; charset=UTF-8

headers: Metadata(:status=404, content-type=text/html; charset=UTF-8, referrer-policy=no-referrer, date=Thu, 12 Sep 2024 13:04:01 GMT, alt-svc=h3=":443"; ma=2592000, h3-29=":443"; ma=2592000, content-length=1621)

DATA-----------------------------

404. That’s an error.

The requested URL /google.cloud.aiplatform.v1.PredictionService/GenerateContent was not found on this server. That’s all we know.

Environment

Steps to reproduce

  1. Configure the Vertex AI Gemini chat, as described here

Expected behavior

A query performed using the ChatClient should work, as it does (at least) for ChatGPT and Claude.

Minimal Complete Reproducible example

Here

ddobrin commented 1 month ago

Hi @bogdansolga, would you like to try one of the samples from this codelab and test it out: link?

Just set the project id, region and model name.

bogdansolga commented 1 month ago

Thank you @ddobrin , I will check ASAP 🙂 I certainly hope the samples will work.

asaikali commented 3 weeks ago

@bogdansolga were you able to resolve your issue?

bogdansolga commented 3 weeks ago

@asaikali and @ddobrin - not at all. Went through Google's (overly complicated and complexified) process of setting up the Application Default Credentials, activated and enabled some settings in Google Cloud (way too many and complex to remember), ran the ConversationExample and got exactly the same error that I had... almost two months ago.

I have attached the stack trace, in case it helps in any way. I am more than disappointed about the complexity of the Gemini integration; not blaming the Spring AI support, in any way.

gemini invocation stack trace.txt

bogdansolga commented 3 weeks ago

@ddobrin - unfortunately, the usage of Gemini is far more complicated than 'Just set the project id, region and model name.' It also requires setting up the Application Default Credentials, installing the Google Cloud SDK, and activating and enabling the APIs and Services for that project, ... and even that still seems not to be enough.

I might be using an incorrect region; how can I know which region I can / should use? Where do I configure it?

And, FFS, why is the whole process so overly complexified? Thousands of kudos to OpenAI and Anthropic for requiring just an API key, which can be obtained in 2 minutes (at most)

ddobrin commented 3 weeks ago

@bogdansolga

  1. The sample you provided when opening the issue works if you set the region correctly: spring.ai.vertex.ai.gemini.location=europe-west3

The stack trace shows it clearly: the request was sent to europe-west-3 (with an extra hyphen), which is not a valid location:

The requested URL /v1/projects/ai-query-hub/locations/europe-west-3/publishers/google/models/gemini-1.5-pro-001:generateContent was not found on this server.

See the documentation here: Vertex AI locations

  2. The provided samples also work fine with the correct region.

  3. I'd like to recommend that you revisit the Google docs available for the Gemini API Quickstart, including how to retrieve an API key and access Gemini models with an API key instead of the Auth token.

The Vertex AI Java SDK, on which the Vertex Gemini model is based, currently supports only token-based auth; key-based access is tracked in this issue.
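As an illustration of the region problem above, here is a minimal Java sketch that catches a malformed location such as europe-west-3 before any request is made. It is only a heuristic under stated assumptions: the class VertexLocationCheck and its methods are hypothetical helpers, not part of Spring AI or the Vertex AI SDK, and the regular expression only checks the common "word-word-plus-digits" shape of a region id. It does not confirm that a region actually exists or offers Gemini models; the Vertex AI locations documentation remains the source of truth. The request path is the one that appears in the stack trace.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spring AI or the Vertex AI SDK.
final class VertexLocationCheck {

    // Assumed shape of a regional location id such as "europe-west3" or "us-central1":
    // two lowercase words joined by one hyphen, followed directly by digits.
    private static final Pattern REGION_SHAPE = Pattern.compile("^[a-z]+-[a-z]+\\d+$");

    // Returns true when the id has the expected shape; this does not prove the region exists.
    static boolean looksLikeRegion(String location) {
        return location != null && REGION_SHAPE.matcher(location).matches();
    }

    // Builds the generateContent request path in the form shown in the stack trace,
    // rejecting location ids that do not have the expected shape.
    static String generateContentPath(String projectId, String location, String model) {
        if (!looksLikeRegion(location)) {
            throw new IllegalArgumentException("Suspicious location id: " + location);
        }
        return "/v1/projects/" + projectId + "/locations/" + location
                + "/publishers/google/models/" + model + ":generateContent";
    }

    public static void main(String[] args) {
        // Values taken from the stack trace in this issue.
        System.out.println(looksLikeRegion("europe-west3"));
        System.out.println(looksLikeRegion("europe-west-3"));
        System.out.println(generateContentPath("ai-query-hub", "europe-west3", "gemini-1.5-pro-001"));
    }
}
```

A check like this could run at application startup against the value of spring.ai.vertex.ai.gemini.location, so that a typo fails fast with a readable message instead of surfacing as a 404 HTML page.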

cc: @asaikali

bogdansolga commented 3 weeks ago

Thank you, @ddobrin

With all due respect - how could I have known / found the possible locations in the ocean of information and links that Vertex / Google requires, especially considering that OpenAI and Anthropic require... none?

bogdansolga commented 3 weeks ago

And regarding obtaining an API key - this is absolutely the first mention of the possibility of using an API key.

There was absolutely no reference to this possibility in the Spring AI reference documentation.

ddobrin commented 3 weeks ago

This is a fast-paced environment that evolves quickly: new features are being added to SDKs and AI orchestration frameworks every single day/week/month. Please follow the space.

bogdansolga commented 3 weeks ago

Given that the amount of time Vertex and Google require to test their LLM, which is arguably the 3rd or 4th in the world, is an order of magnitude higher than the benefits, I will easily say: no, thanks.

Case closed, for me.

bogdansolga commented 3 weeks ago

In Gemini's own words (and quasi self awareness):

You're right to be frustrated. Setting up Gemini, especially for the first time, can feel like navigating a maze compared to the smoother onboarding experiences of OpenAI and Anthropic.

Here's a breakdown of why this complexity exists and what Google might be thinking:

Why Gemini Setup Can Be Painful:

Why OpenAI and Anthropic Seem Easier:

What Google Might Be Thinking:

The Bottom Line:

You're not alone in finding Gemini's setup process challenging. While Google's focus on security and enterprise features is understandable, they need to strike a better balance between power and usability. Hopefully, they will streamline the onboarding process and documentation to make Gemini more accessible to a wider audience.