I deployed the model mistralai_mistral-7b-instruct-v0_2 via Vertex AI, and I invoke it with this very simple code:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_vertexai import VertexAIModelGarden


def main():
    llm = VertexAIModelGarden(
        project="XXXX",
        location="europe-west2",
        endpoint_id="0000000000",
    )
    print("-------------")
    prompt = ChatPromptTemplate.from_template("tell me a short joke about {topic}")
    chain = prompt | llm | StrOutputParser()
    final_result = chain.invoke({"topic": "ice cream"})
    print(final_result)


main()
The result is:
-------------
Prompt:
Human: tell me a short joke about ice cream
Output:
Anytime! Here's a classic, light-hearted ice
Process finished with exit code 0
The output is truncated: I get back an echo of the prompt plus only about 16 generated tokens, cut off mid-sentence.
Where is the error?
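For what it's worth, output that stops mid-sentence after a handful of tokens usually points to the endpoint's default generation-length cap rather than to LangChain itself. A minimal sketch of the idea below, assuming the deployed Mistral serving container accepts a `max_tokens` parameter in each prediction instance (the parameter name is an assumption; check the container's documentation for your deployment). The helper only builds the raw `endpoint.predict()` payload, so it runs without credentials:

```python
# Hypothetical helper: build the instances payload that a Model Garden text
# endpoint typically receives, with an explicit generation-length cap.
# "max_tokens" is an ASSUMED parameter name for the Mistral serving container.

def build_instances(prompt: str, max_tokens: int = 256) -> list[dict]:
    """Return the instances list for aiplatform Endpoint.predict()."""
    return [{"prompt": prompt, "max_tokens": max_tokens}]


# On the LangChain side, VertexAIModelGarden has an allowed_model_args field
# that whitelists extra kwargs to forward to the endpoint; whether the
# container honours them is deployment-specific (assumption):
#
#   llm = VertexAIModelGarden(
#       project="XXXX",
#       location="europe-west2",
#       endpoint_id="0000000000",
#       allowed_model_args=["max_tokens"],
#   )
#   chain = prompt | llm.bind(max_tokens=512) | StrOutputParser()

payload = build_instances("tell me a short joke about ice cream", max_tokens=512)
print(payload[0]["max_tokens"])  # 512
```

If the raw `predict()` call with a larger `max_tokens` returns a full answer, the truncation was the endpoint default and the fix belongs in the model kwargs, not the prompt.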