abhirooptalasila closed this issue 1 week ago
Hey @abhirooptalasila! I'm here to help you with any bugs, questions, or becoming a contributor. Let's squash those bugs together!
The LlamaIndex codebase includes a fallback mechanism that defaults to a specific LLM if the specified model is not available. If the provided model name is not found, the code attempts to use the default OpenAI model. If the OpenAI model cannot be loaded (e.g., due to an invalid API key), it raises a ValueError with a message indicating the issue and suggesting to check the OPENAI_API_KEY. Additionally, if the environment is set for testing, it uses a MockLLM instead.
Here is the relevant part of the code:
def resolve_llm(
    llm: Optional[LLMType] = None, callback_manager: Optional[CallbackManager] = None
) -> LLM:
    if llm == "default":
        # if testing return mock llm
        if os.getenv("IS_TESTING"):
            llm = MockLLM()
            llm.callback_manager = callback_manager or Settings.callback_manager
            return llm

        # return default OpenAI model. If it fails, return LlamaCPP
        try:
            from llama_index.llms.openai import OpenAI
            from llama_index.llms.openai.utils import validate_openai_api_key

            llm = OpenAI()
            validate_openai_api_key(llm.api_key)
        except ImportError:
            raise ImportError(
                "`llama-index-llms-openai` package not found, "
                "please run `pip install llama-index-llms-openai`"
            )
        except ValueError as e:
            raise ValueError(
                "\n******\n"
                "Could not load OpenAI model. "
                "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
                "Original error:\n"
                f"{e!s}"
                "\nTo disable the LLM entirely, set llm=None."
                "\n******"
            )
    elif llm is None:
        print("LLM is explicitly disabled. Using MockLLM.")
        llm = MockLLM()

    assert isinstance(llm, LLM)

    llm.callback_manager = callback_manager or Settings.callback_manager
    return llm
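To make the branching above easier to follow, here is a simplified, self-contained sketch of the decision tree (illustrative only — it returns labels instead of real LLM objects, and it is not the actual llama_index implementation):

```python
import os

def resolve_llm_sketch(llm, is_testing=None):
    """Simplified sketch of resolve_llm's branching (illustrative only)."""
    if is_testing is None:
        is_testing = bool(os.getenv("IS_TESTING"))
    if llm == "default":
        if is_testing:
            return "MockLLM"  # test runs never touch the network
        return "OpenAI"       # default provider; real code raises if the key is invalid
    if llm is None:
        return "MockLLM"      # LLM explicitly disabled
    return llm                # caller supplied a concrete LLM; it is used as-is
```

The last branch is the important one for this thread: passing a concrete LLM instance (such as a Vertex object) skips the OpenAI default entirely, which is why no OPENAI_API_KEY is needed.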
Additionally, the LlamaIndex codebase includes support for a default ModelScope model. If no specific model is provided, it defaults to using the qwen/Qwen-7B-Chat model from ModelScope. This is defined in the ModelScopeLLM class, which sets default values for the model name, revision, task type, and other parameters [1][2][3][4].
This fallback mechanism ensures that if the specified model is not available, the system will attempt to use a default model, which might explain why the text-bison model works even if you don't have a PaLM 2 instance deployed.
To continue talking to Dosu, mention @dosu.
@dosu I got that, but I don't have any OpenAI keys in my env.
llm = Vertex(model="text-bison", temperature=0, additional_kwargs={}, project=credentials.project_id, credentials=credentials)
llm.complete("Hello this is a sample text").text
I still get a completed output when I try out the above code using the GCP JSON credentials.
Please check the llama_index/llms/vertex/base.py file for class implementation details.
You are getting a completed output using the Vertex class with the text-bison model and GCP JSON credentials because the Vertex class from the llama_index.legacy.llms.vertex module is designed to work with Google Cloud's Vertex AI, not OpenAI. The credentials you provided in the form of a GCP JSON file are sufficient for authentication and interaction with Vertex AI.
Here is the relevant code snippet from the document:
from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account

filename = "vertex-407108-37495ce6c303.json"
credentials: service_account.Credentials = (
    service_account.Credentials.from_service_account_file(filename)
)
Vertex(
    model="text-bison", project=credentials.project_id, credentials=credentials
)
This code shows that the Vertex class is initialized with the text-bison model using the provided GCP JSON credentials, which are used to authenticate and interact with Vertex AI. Therefore, OpenAI keys are not required in this context [1][2][3].
Additionally, the Vertex class includes methods such as complete and chat that handle the interaction with Vertex AI models, ensuring that the provided credentials are used correctly for generating responses. This further confirms that OpenAI keys are unnecessary for this setup.
The Vertex class is designed to support various models and configurations, including chat models, code models, and text generation models. It initializes the appropriate client based on the model specified and uses the provided credentials for authentication. This comprehensive setup ensures that the interaction with Vertex AI is seamless and does not require any OpenAI keys [1][2][3].
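A rough sketch of that client dispatch is shown below. The category lists here are placeholders for illustration — the authoritative lists (CHAT_MODELS, TEXT_MODELS, CODE_MODELS, and so on) live in the llama_index.llms.vertex package, and the returned labels stand in for the real Vertex AI client classes:

```python
# Hypothetical category lists for illustration; check llama_index/llms/vertex
# for the real, complete lists.
CHAT_MODELS = ["chat-bison", "chat-bison-32k", "chat-bison@001"]
TEXT_MODELS = ["text-bison", "text-bison-32k", "text-bison@001"]
CODE_MODELS = ["code-bison", "code-bison-32k", "code-bison@001"]

def pick_client(model: str) -> str:
    """Return which kind of Vertex AI client a model name maps to (sketch)."""
    if model in CHAT_MODELS:
        return "ChatModel"
    if model in TEXT_MODELS:
        return "TextGenerationModel"
    if model in CODE_MODELS:
        return "CodeGenerationModel"
    # Unknown names are rejected up front rather than silently defaulted.
    raise ValueError(f"Model {model} is not a recognized Vertex model name")
```

Under this dispatch, "text-bison" resolves to a text-generation client, while an unlisted name such as "llama-3" raises immediately.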
@dosu Then how do I use a Llama 3 model instead?
To use a Llama 3 model with the Vertex class in LlamaIndex, you can follow this example:
from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account

# Load your GCP credentials
filename = "path/to/your/service-account.json"
credentials = service_account.Credentials.from_service_account_file(filename)

# Initialize the Vertex class with the desired Llama 3 model
llm = Vertex(
    model="llama-3",  # Replace with the specific Llama 3 model name
    project=credentials.project_id,
    credentials=credentials,
)

# Use the complete method to generate a response
output = llm.complete("Say foo:")
print(output.text)
Replace "llama-3" with the specific Llama 3 model you want to use. The complete method is used to generate a response from the model [1].
@dosu "llama-3" is not an available option in the Vertex class.
To use a Llama 3 model with the Vertex class, you need to ensure that the model name you provide is supported. If "llama-3" is not an available option, you should use one of the supported model names listed in TEXT_MODELS, CODE_MODELS, CODE_CHAT_MODELS, or other relevant categories.
Here is an example of how to use a supported model with the Vertex class:
from llama_index.llms.vertex import Vertex

# Set up necessary variables
credentials = {
    "project_id": "INSERT_PROJECT_ID",
    "api_key": "INSERT_API_KEY",
}

# Create an instance of the Vertex class with a supported model
llm = Vertex(
    model="text-bison",  # Replace with a supported model name
    project=credentials["project_id"],
    credentials=credentials,
)

# Access the complete method from the instance
response = llm.complete("Hello world!")
print(str(response))
If you need to use a specific model, ensure it is listed in the supported models or check if it can be added to TEXT_MODELS, CODE_MODELS, or other relevant lists. If "llama-3" is not supported, you will need to choose an alternative model that is available.
The Vertex class supports various models, and if the model name is not found, it raises a ValueError [1][2].
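Since an unsupported name only surfaces as a ValueError at construction time, one option is to validate the name yourself before instantiating Vertex. The list below is a placeholder union of the supported-model categories discussed above (the real lists are defined in the llama_index.llms.vertex package), and the fuzzy-match hint is an extra convenience, not part of the library:

```python
import difflib

# Placeholder union of supported Vertex model names, for illustration only.
SUPPORTED_MODELS = [
    "text-bison", "text-bison-32k",
    "chat-bison", "chat-bison-32k",
    "code-bison", "codechat-bison",
]

def check_model(name: str) -> str:
    """Return name if supported, else raise with the closest known match."""
    if name in SUPPORTED_MODELS:
        return name
    hint = difflib.get_close_matches(name, SUPPORTED_MODELS, n=1)
    msg = f"'{name}' is not a supported Vertex model"
    if hint:
        msg += f"; did you mean '{hint[0]}'?"
    raise ValueError(msg)
```

Calling check_model("llama-3") fails fast with a clear message, instead of the error appearing later inside the Vertex constructor.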
@logan-markewich Any idea how I can find out which model it is defaulting to? I don't have any other models deployed on GCP. I've compared outputs from local and online GCP, both are different.
I did go through that, but as you must've noticed, the Llama models are not included in the models list. What can I do now?
Question Validation
Question
I'm following this page to integrate a Llama 3 endpoint into my existing workflow using chat engines. I've added my service account JSON credentials, and while trying to instantiate a Vertex instance, I can see that the only model options are chat-bison, chat-bison-32k, chat-bison@001, etc. Using the default text-bison model name, I can create an instance, and the completion API works, too. But that shouldn't work: I don't have a PaLM 2 instance deployed. Is there any other logic making it default to an available LLM?