Cannot invoke Vertex AI, getting a 'not implemented' HTML page

bogdansolga commented 2 months ago

Bug description

When I invoke the Google Gemini LLM, using either the Spring AI wrappers or the native Vertex API (described here), I get a weird 'not implemented' HTML error:

An error has occurred - 'io.grpc.StatusRuntimeException: UNIMPLEMENTED: HTTP status code 404

invalid content-type: text/html; charset=UTF-8

headers: Metadata(:status=404, content-type=text/html; charset=UTF-8, referrer-policy=no-referrer, date=Thu, 12 Sep 2024 13:04:01 GMT, alt-svc=h3=":443"; ma=2592000, h3-29=":443"; ma=2592000, content-length=1621)

DATA-----------------------------

404. That’s an error.

The requested URL /google.cloud.aiplatform.v1.PredictionService/GenerateContent was not found on this server. That’s all we know.
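For context on why an HTTP 404 surfaces as 'not implemented': the gRPC specification defines a fallback mapping from HTTP status codes to gRPC status codes for responses that are not valid gRPC (here, an HTML page), and HTTP 404 maps to UNIMPLEMENTED. Below is a minimal Java sketch of that mapping table. It is not the grpc-java implementation; the class and method names are hypothetical, and it returns plain strings rather than io.grpc.Status values.

```java
// Illustrative sketch only: mirrors the HTTP-to-gRPC status mapping table
// from the gRPC specification. Class and method names are hypothetical.
final class HttpToGrpcStatus {

    private HttpToGrpcStatus() {}

    /** Returns the gRPC status name used when a server answers with a non-gRPC HTTP response. */
    static String fromHttpStatus(int httpStatus) {
        switch (httpStatus) {
            case 400: return "INTERNAL";
            case 401: return "UNAUTHENTICATED";
            case 403: return "PERMISSION_DENIED";
            case 404: return "UNIMPLEMENTED";
            case 429:
            case 502:
            case 503:
            case 504: return "UNAVAILABLE";
            default:  return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        // The response in this report carried HTTP status 404.
        System.out.println(fromHttpStatus(404));
    }
}
```

So UNIMPLEMENTED in this error most likely means "the URL was not found", not that the Gemini API lacks the method being called.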

Environment

Spring AI version: 1.0.0.M2
Java version: 21
Vector store: pgVector (not relevant to this issue, though)

Steps to reproduce

Configure the Vertex AI Gemini chat, as described here

Expected behavior

A query performed using the ChatClient should work, as it does (at least) for ChatGPT and Claude.

Minimal Complete Reproducible example

Here

ddobrin commented 1 month ago

Hi @bogdansolga, would you like to try one of the samples from this codelab and test it out: link?

Just set the project id, region and model name.

bogdansolga commented 1 month ago

Thank you, @ddobrin, I will check ASAP 🙂 I certainly hope the samples will work.

asaikali commented 3 weeks ago

@bogdansolga, were you able to resolve your issue?

bogdansolga commented 3 weeks ago

@asaikali and @ddobrin - not at all. Went through Google's (overly complicated and complexified) process of setting up the Application Default Credentials, activated and enabled some settings in Google Cloud (way too many and complex to remember), ran the ConversationExample and got exactly the same error that I had... almost two months ago.

I have attached the stack trace, in case it helps in any way. I am more than disappointed about the complexity of the Gemini integration; not blaming the Spring AI support, in any way.

gemini invocation stack trace.txt

bogdansolga commented 3 weeks ago

@ddobrin - unfortunately, using Gemini is far more complicated than 'Just set the project id, region and model name.' It also requires setting up the Application Default Credentials, installing the Google Cloud SDK, activating and enabling the APIs and Services for that project, ... and even that still does not seem to be enough.

I might be using an incorrect region; how can I know which region I can / should use? Where do I configure it?

And, FFS, why is the whole process so overly complicated? Thousands of kudos to OpenAI and Anthropic for requiring just an API key, which can be obtained in 2 minutes (at most).

ddobrin commented 3 weeks ago

@bogdansolga

The sample you have provided to start the issue works if you set the region correctly: spring.ai.vertex.ai.gemini.location=europe-west3

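The stack trace shows the problem clearly: the location in the requested URL is europe-west-3, which is not a valid region name (the correct one is europe-west3):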

The requested URL /v1/projects/ai-query-hub/locations/europe-west-3/publishers/google/models/gemini-1.5-pro-001:generateContent was not found on this server.

See the documentation here: Vertex AI locations
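To make the difference between europe-west-3 and europe-west3 concrete, here is a small Java sketch, assuming the `{location}-aiplatform.googleapis.com` pattern for regional Vertex AI endpoints. The class name, method names and regular expression are hypothetical illustrations, not part of Spring AI or the Vertex AI Java SDK. The regex only checks the usual shape of a Google Cloud region name; it does not confirm that Vertex AI or a given Gemini model is offered in that region, for which the Vertex AI locations page remains the reference.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: names and the regex are assumptions, not part of
// Spring AI or the Vertex AI Java SDK.
final class VertexLocationCheck {

    // Usual shape of a Google Cloud region: "<area>-<direction><number>",
    // e.g. "europe-west3" or "us-central1". A format check, not an availability check.
    private static final Pattern REGION_SHAPE = Pattern.compile("[a-z]+-[a-z]+[0-9]+");

    private VertexLocationCheck() {}

    /** Returns true if the value has the typical shape of a region name. */
    static boolean looksLikeRegion(String location) {
        return location != null && REGION_SHAPE.matcher(location).matches();
    }

    /** Builds the regional API host, rejecting values that do not look like a region. */
    static String regionalHost(String location) {
        if (!looksLikeRegion(location)) {
            throw new IllegalArgumentException("Not a valid-looking region name: " + location);
        }
        return location + "-aiplatform.googleapis.com";
    }

    public static void main(String[] args) {
        System.out.println(looksLikeRegion("europe-west-3")); // extra hyphen before the digit
        System.out.println(regionalHost("europe-west3"));
    }
}
```

A check like this, run against the configured spring.ai.vertex.ai.gemini.location value at startup, would flag the extra hyphen before any request is sent, instead of leaving it to show up as the HTML 404 reported above.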

The provided samples also work fine with the correct region.
I'd recommend revisiting the Google docs for the Gemini API Quickstart, including how to retrieve an API key and access Gemini models with an API key instead of the auth token.

The Vertex AI Java SDK, on which the Vertex Gemini model is based, currently supports only token-based auth; key-based access will be addressed by this issue.

cc: @asaikali

bogdansolga commented 3 weeks ago

Thank you, @ddobrin

With all due respect - how could I have known / found the possible locations in the ocean of information and links that Vertex / Google requires, especially considering that OpenAI and Anthropic require... none?

bogdansolga commented 3 weeks ago

And regarding obtaining an API key - this is absolutely the first mention of the possibility of using one.

There was absolutely no reference to this possibility in the Spring AI reference documentation.

ddobrin commented 3 weeks ago

This is a fast-paced environment that evolves quickly: new features are being added to SDKs and AI orchestration frameworks every single day/week/month. Please follow the space.

bogdansolga commented 3 weeks ago

Given that the amount of time Vertex and Google require just to test their LLM, which is arguably the 3rd or 4th in the world, is an order of magnitude higher than the benefits, I will easily say: no, thanks.

Case closed, for me.

bogdansolga commented 3 weeks ago

In Gemini's own words (and quasi-self-awareness):

You're right to be frustrated. Setting up Gemini, especially for the first time, can feel like navigating a maze compared to the smoother onboarding experiences of OpenAI and Anthropic.

Here's a breakdown of why this complexity exists and what Google might be thinking:

Why Gemini Setup Can Be Painful:

Google Cloud Platform's Vastness: Gemini is deeply integrated with Google Cloud Platform (GCP), which is a sprawling ecosystem. While this offers power and flexibility, it also means navigating numerous services, authentication methods, and configurations.
Security First Approach: GCP prioritizes granular security controls. This is generally a good thing, but it can lead to a multi-layered authentication process (like Application Default Credentials) that feels cumbersome for new users.
Enterprise Focus: GCP, and by extension Gemini, caters to a wide range of users, including large enterprises with complex security and infrastructure needs. This can make the setup process less beginner-friendly.

Why OpenAI and Anthropic Seem Easier:

Streamlined Onboarding: OpenAI and Anthropic have focused on making their APIs accessible with simpler authentication keys and less configuration overhead. This prioritizes ease of use, especially for developers and researchers just starting out.
Purpose-Built Platforms: These companies have built their platforms specifically for AI model access, allowing for a more streamlined and focused user experience.

What Google Might Be Thinking:

Existing Infrastructure: Google leverages its existing GCP infrastructure for Gemini, which might explain the reliance on established (but complex) authentication and configuration methods.
Gradual Rollout: Gemini is still relatively new. Google might be prioritizing functionality and integration over simplifying the initial setup, with plans to improve the user experience over time.
Diverse Use Cases: The complexity might stem from Google's aim to cater to a wide range of use cases, from individual developers to large enterprises with specific security and compliance requirements.

The Bottom Line:

You're not alone in finding Gemini's setup process challenging. While Google's focus on security and enterprise features is understandable, they need to strike a better balance between power and usability. Hopefully, they will streamline the onboarding process and documentation to make Gemini more accessible to a wider audience.

spring-projects / spring-ai

Cannot invoke Vertex AI, getting a 'not implemented' HTML page #1349