langchain-ai / langchain-google

MIT License

Breaking change v1.0.3 -> v1.0.4 with ChatVertexAI - 503 Getting metadata from plugin failed with error: Bad Request #236

Closed: ventz closed this issue 3 weeks ago

ventz commented 1 month ago

Hi,

Noticed that when going from:

langchain-google-vertexai     1.0.3

to

langchain-google-vertexai     1.0.4

(due to going from langchain 0.1.20 -> langchain 0.2.0)

That working code with ChatVertexAI broke.

Output:

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ServiceUnavailable: 503 Getting metadata from plugin failed with error: ('invalid_grant: Bad Request', {'error': 'invalid_grant', 'error_description': 'Bad Request'}).
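For context, the "Retrying ... in 4.0 seconds" line comes from the library's backoff wrapper around the completion call. Conceptually (this is a sketch, not the actual implementation in `langchain_google_vertexai.chat_models._completion_with_retry`), it behaves like:

```python
import time

def with_retry(fn, max_attempts=3, base_delay=4.0):
    """Call fn(), retrying transient failures with exponential backoff.

    Conceptual sketch only; RuntimeError stands in for the real
    google.api_core ServiceUnavailable exception.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = base_delay * (2 ** attempt)
            print(f"Retrying in {delay} seconds as it raised {exc}")
            time.sleep(delay)
```

So the 503 itself is raised by the auth/transport layer underneath; the wrapper just keeps re-trying it, which is why the same message repeats.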

Specifically, this works with:

langchain                     0.1.20
langchain-community           0.0.38
langchain-core                0.1.52
langchain-google-vertexai     1.0.3
langchain-openai              0.1.7
langchain-text-splitters      0.0.2

And it does NOT work with:

langchain                     0.2.0
langchain-community           0.0.34
langchain-core                0.2.0
langchain-google-vertexai     1.0.4
langchain-openai              0.1.4
langchain-text-splitters      0.2.0
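When comparing environments like this, the exact version set can be captured with a few lines of stdlib Python (package names taken from the lists above; this is just a convenience, equivalent to `pip list | grep langchain`):

```python
from importlib import metadata

PKGS = ["langchain", "langchain-community", "langchain-core",
        "langchain-google-vertexai", "langchain-openai",
        "langchain-text-splitters"]

def get_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

for name in PKGS:
    print(f"{name:30} {get_version(name) or '(not installed)'}")
```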

I'm not seeing any documentation updates about this, but I would imagine this should not break?

ventz commented 1 month ago

Updated information:

It works in 1.0.2 and 1.0.3 -- and it breaks in 1.0.4

lkuligin commented 1 month ago

could you share any reproducible snippet, please?

ventz commented 1 month ago

@lkuligin No problem - thanks:

import dotenv, os, json

from google.oauth2 import service_account

from langchain_google_vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser

class GeminiPro():
    def __init__(self):
        dotenv.load_dotenv()
        GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID")
        GCP_REGION = os.environ.get("GCP_REGION")
        GCP_CREDENTIALS_JSON = os.environ.get("GCP_CREDENTIALS_JSON")

        credentials = service_account.Credentials.from_service_account_info(json.loads(GCP_CREDENTIALS_JSON))
        scoped_creds = credentials.with_scopes(["https://www.googleapis.com/auth/cloud-platform"])

        self.llm = ChatVertexAI(
                model_name="gemini-pro",
                convert_system_message_to_human=False,
                project=GCP_PROJECT_ID,
                location=GCP_REGION,
                credentials=scoped_creds,
                max_output_tokens=8192,
                temperature=0.2,
        )

        self.output_parser = StrOutputParser()

        self.prompt_template = ChatPromptTemplate.from_messages([
            ("system", "You are a helpful assistant. Answer all questions to the best of your ability."),
            MessagesPlaceholder(variable_name="messages"),
        ])

        chain = self.prompt_template | self.llm | self.output_parser

        response = chain.invoke({
            "messages": [
                HumanMessage(content=prompt),
            ],
        })
        print(response)
ventz commented 1 month ago

@lkuligin Just wanted to see if you had a chance to look into this?

Adi8885 commented 1 month ago

@ventz Can you please re-share the code, showing where the variable `prompt` is defined?

        "messages": [
            HumanMessage(content=prompt),
        ],
gmogr commented 1 month ago

Hi @ventz, I checked your code locally.

With the 1.0.4 version of the lib it seems to work.

pip show langchain-google-vertexai
Name: langchain-google-vertexai
Version: 1.0.4

ventz commented 1 month ago

@gmogr @lkuligin @Adi8885 Here is a drop-in test I just wrote -- created a new Python project+venv, and it fails. (If I go back to the previous langchain-google-vertexai 1.0.3 + langchain 0.1.20, it works.)

% python test.py        
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ServiceUnavailable: 503 Getting metadata from plugin failed with error: ('invalid_grant: Bad Request', {'error': 'invalid_grant', 'error_description': 'Bad Request'}).

Here is a complete "drop-in" test:

# test.py
import dotenv, os, json

from google.oauth2 import service_account

from langchain_google_vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser

class GeminiPro():
    def __init__(self, prompt):
        dotenv.load_dotenv()
        GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID")
        GCP_REGION = os.environ.get("GCP_REGION")
        GCP_CREDENTIALS_JSON = os.environ.get("GCP_CREDENTIALS_JSON")

        credentials = service_account.Credentials.from_service_account_info(json.loads(GCP_CREDENTIALS_JSON))
        scoped_creds = credentials.with_scopes(["https://www.googleapis.com/auth/cloud-platform"])

        self.llm = ChatVertexAI(
                model_name="gemini-pro",
                convert_system_message_to_human=False,
                project=GCP_PROJECT_ID,
                location=GCP_REGION,
                credentials=scoped_creds,
                max_output_tokens=8192,
                temperature=0.2,
        )

        self.output_parser = StrOutputParser()

        self.prompt_template = ChatPromptTemplate.from_messages([
            ("system", "You are a helpful assistant. Answer all questions to the best of your ability."),
            MessagesPlaceholder(variable_name="messages"),
        ])

        chain = self.prompt_template | self.llm | self.output_parser

        response = chain.invoke({
            "messages": [
                HumanMessage(content=prompt),
            ],
        })
        print(response)

test = GeminiPro("What llm are you?")
print(test)
% cat requirements.txt        
python-dotenv
langchain
langchain-google-vertexai 
google-cloud-aiplatform
% pip list | grep langchain
langchain                     0.2.1
langchain-core                0.2.1
langchain-google-vertexai     1.0.4
langchain-text-splitters      0.2.0

% pip list | grep google   
google-api-core               2.19.0
google-auth                   2.29.0
google-cloud-aiplatform       1.52.0
google-cloud-bigquery         3.23.1
google-cloud-core             2.4.1
google-cloud-resource-manager 1.12.3
google-cloud-storage          2.16.0
google-crc32c                 1.5.0
google-resumable-media        2.7.0
googleapis-common-protos      1.63.0
grpc-google-iam-v1            0.13.0
langchain-google-vertexai     1.0.4

Using a .env file:

export GCP_PROJECT_ID="<redacted>"
export GCP_REGION="us-central1"
export GCP_CREDENTIALS_JSON='{
  "type": "service_account",
  "project_id": "<redacted>",
  ...

Could you test the above in a new environment/clean Python venv?

We have a few different projects (different projects/envs, different teams) with a similar setup, and all of them started failing after the upgrade in our dev environments.
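Unrelated to the library change itself, one quick local sanity check (pure stdlib, no network; field names taken from the standard service-account key format) is that the JSON in `GCP_CREDENTIALS_JSON` actually parses and carries the fields `from_service_account_info` expects:

```python
import json
import os

# Fields a service-account key file normally contains.
REQUIRED_FIELDS = {"type", "project_id", "private_key_id", "private_key",
                   "client_email", "token_uri"}

def check_sa_json(raw):
    """Parse a service-account JSON string and return any missing required fields."""
    info = json.loads(raw)
    return sorted(REQUIRED_FIELDS - info.keys())

# Example usage with the env var from the snippets above:
# missing = check_sa_json(os.environ["GCP_CREDENTIALS_JSON"])
# if missing:
#     print("missing fields:", missing)
```

A malformed or truncated key in the env var can also produce `invalid_grant: Bad Request`, so this rules that out cheaply before digging into library versions.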

ventz commented 1 month ago

@gmogr @lkuligin @Adi8885 Any luck replicating this?

From my side, a co-worker ran into the same issue on their machine, so I'm assuming it's not something specific to my environment.

gmogr commented 1 month ago

Hi @ventz sorry for the delay, I'll be further investigating today.

ventz commented 1 month ago

Hi @gmogr No problem at all - thank you!

Subham0793 commented 4 weeks ago

@gmogr @ventz Were you able to find the issue / resolution?

ventz commented 3 weeks ago

@gmogr Thanks. Sending email (feel free to delete your comment so your email is not available)

gmogr commented 3 weeks ago

@ventz I don't see the message from you. Did the commit from @lkuligin fix the issue?

ventz commented 3 weeks ago

@gmogr see email title:

ChatVertexAI - v.1.0.3 → v1.0.4 (503 Getting metadata from plugin failed with error: Bad Request)

just sent a "Hello - email from Ventz"

Let me try the commit from @lkuligin

ventz commented 3 weeks ago

@gmogr @lkuligin No error after applying the commit; however, I'm seeing another problem.

For example: after running it 6 times, I got an output (see `> RESPONSE:`) only twice; the rest come back with no output:

% python test.py
first=ChatPromptTemplate(input_variables=['messages'], input_types={'messages': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant. Answer all questions to the best of your ability.')), MessagesPlaceholder(variable_name='messages')]) middle=[ChatVertexAI(project='<redacted>', model_name='gemini-pro', model_family=<GoogleModelFamily.GEMINI: '1'>, full_model_name='projects/<redacted>/locations/us-central1/publishers/google/models/gemini-pro', client_options=ClientOptions: {'api_endpoint': 'us-central1-aiplatform.googleapis.com', 'client_cert_source': None, 'client_encrypted_cert_source': None, 'quota_project_id': None, 'credentials_file': None, 'scopes': None, 'api_key': None, 'api_audience': None, 'universe_domain': None}, default_metadata=(), credentials=<google.oauth2.service_account.Credentials object at 0x10a9bde10>, temperature=0.2, max_output_tokens=8192)] last=StrOutputParser()
> RESPONSE: 

vs

% python test.py
[chain repr identical to the run above, apart from the credentials object address]
> RESPONSE: I am a large language model, trained by Google.

Here is the test code:

# test.py
import dotenv, os, json

from google.oauth2 import service_account

from langchain_google_vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser

dotenv.load_dotenv()
GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID")
GCP_REGION = os.environ.get("GCP_REGION")
GCP_CREDENTIALS_JSON = os.environ.get("GCP_CREDENTIALS_JSON")

credentials = service_account.Credentials.from_service_account_info(json.loads(GCP_CREDENTIALS_JSON))
scoped_creds = credentials.with_scopes(["https://www.googleapis.com/auth/cloud-platform"])

llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=False,
        project=GCP_PROJECT_ID,
        location=GCP_REGION,
        credentials=scoped_creds,
        max_output_tokens=8192,
        temperature=0.2,
)

output_parser = StrOutputParser()

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer all questions to the best of your ability."),
    MessagesPlaceholder(variable_name="messages"),
])

chain = prompt_template | llm | output_parser
print(chain)

response = chain.invoke({
    "messages": [
        HumanMessage(content="What llm are you"),
    ],
})
print(f"> RESPONSE: {response}")

If it's helpful, this is authenticated with a .env file:

GCP_PROJECT_ID="<redacted>"
GCP_REGION="us-central1"
GCP_CREDENTIALS_JSON='{
 "type": "service_account",
  "project_id": "<redacted>",
  "private_key_id":  ...
  ...
}

Here are 6 new test runs with a lowered max tokens (max_output_tokens=8000); you can see that only 2 produced output:

% python test.py
first=ChatPromptTemplate(input_variables=['messages'], input_types={'messages': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant. Answer all questions to the best of your ability.')), MessagesPlaceholder(variable_name='messages')]) middle=[ChatVertexAI(project='<redacted>', model_name='gemini-pro', model_family=<GoogleModelFamily.GEMINI: '1'>, full_model_name='projects/<redacted>/locations/us-central1/publishers/google/models/gemini-pro', client_options=ClientOptions: {'api_endpoint': 'us-central1-aiplatform.googleapis.com', 'client_cert_source': None, 'client_encrypted_cert_source': None, 'quota_project_id': None, 'credentials_file': None, 'scopes': None, 'api_key': None, 'api_audience': None, 'universe_domain': None}, default_metadata=(), credentials=<google.oauth2.service_account.Credentials object at 0x105026dd0>, temperature=0.2, max_output_tokens=8000)] last=StrOutputParser()
> RESPONSE: I am a large language model, trained by Google.

% python test.py
[chain repr identical to the first run, apart from the credentials object address]
> RESPONSE: I am a large language model, trained by Google.

% python test.py
[chain repr identical to the first run, apart from the credentials object address]
> RESPONSE: 

% python test.py
[chain repr identical to the first run, apart from the credentials object address]
> RESPONSE: 

% python test.py
[chain repr identical to the first run, apart from the credentials object address]
> RESPONSE: 

% python test.py
[chain repr identical to the first run, apart from the credentials object address]
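Until the root cause of the intermittent empty responses is found, a crude stopgap is to re-invoke on empty output. Sketched generically here with a stand-in callable (not the real fix, and `invoke_until_nonempty` is a hypothetical helper, not a langchain API):

```python
def invoke_until_nonempty(invoke, payload, attempts=3):
    """Call a chain-like `invoke` up to `attempts` times and return the
    first non-empty result (or the last empty one if every attempt fails)."""
    result = ""
    for _ in range(attempts):
        result = invoke(payload)
        if result:
            return result
    return result

# Hypothetical usage with the chain from the test script above:
# response = invoke_until_nonempty(chain.invoke,
#                                  {"messages": [HumanMessage(content="What llm are you")]})
```

This only papers over the symptom; the empty responses themselves still need investigating (e.g. checking the raw generation's finish reason).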
lkuligin commented 3 weeks ago

That looks like a separate problem, let's discuss it in a separate issue, please. Closing this for now.

ventz commented 3 weeks ago

Sounds good. Thanks. Just linking new issue here for anyone following:

https://github.com/langchain-ai/langchain-google/issues/297