DavidLMS closed this issue 11 months ago.
🤖
It's great to see that you've identified the issue and even found a potential solution by explicitly adding the api_base argument in the chat_models/litellm.py file. Your contribution would be valuable to the LangChain community. If you're willing to create a pull request with your proposed fix, it would be a fantastic addition to the framework. Thank you for considering contributing to LangChain!
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot I think the solution I proposed should not be the definitive one; it is a temporary fix, because `api_base` should be passed correctly through `kwargs`.
🤖
Thank you for bringing this issue to our attention. It seems like the `api_base` parameter is not being correctly propagated in the `ChatLiteLLM` class. This is likely causing the issue you're experiencing with the model loading error.
The `api_base` parameter in the `ChatLiteLLM` class is used to set the base URL for the API that the client will interact with. It is handled in the `_client_params` property method of the `ChatLiteLLM` class. Here is the relevant code snippet:
```python
@property
def _client_params(self) -> Dict[str, Any]:
    """Get the parameters used for the openai client."""
    set_model_value = self.model
    if self.model_name is not None:
        set_model_value = self.model_name
    self.client.api_base = self.api_base
    self.client.organization = self.organization
    creds: Dict[str, Any] = {
        "model": set_model_value,
        "force_timeout": self.request_timeout,
    }
    return {**self._default_params, **creds}
```
In this method, `self.client.api_base` is set to `self.api_base`. This means that the `api_base` parameter of the `ChatLiteLLM` instance is used to set the `api_base` attribute of the client. This attribute is typically used to specify the base URL for the API that the client will interact with.
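For illustration only, here is a minimal, self-contained sketch of how `api_base` could instead be returned alongside the other per-request parameters. This is an assumption for discussion, not the actual LangChain implementation; the class name and endpoint URL are invented for the example.

```python
from typing import Any, Dict, Optional


class ChatLiteLLMSketch:
    """Hypothetical stand-in for ChatLiteLLM, for illustration only."""

    def __init__(self, model: str, api_base: Optional[str] = None,
                 request_timeout: int = 600) -> None:
        self.model = model
        self.api_base = api_base
        self.request_timeout = request_timeout
        self._default_params: Dict[str, Any] = {"temperature": 1.0}

    @property
    def _client_params(self) -> Dict[str, Any]:
        # Instead of mutating a shared client attribute, include api_base in the
        # dict that gets spread into every completion call.
        creds: Dict[str, Any] = {
            "model": self.model,
            "force_timeout": self.request_timeout,
            "api_base": self.api_base,
        }
        return {**self._default_params, **creds}


llm = ChatLiteLLMSketch(
    model="huggingface/Intel/neural-chat-7b-v3-1",
    api_base="https://my-endpoint.example.com",  # placeholder URL
)
print(llm._client_params)  # api_base now travels with the per-request kwargs
```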
However, it seems like the `api_base` parameter is not being passed to the `qianfan.ChatCompletion` client in the `validate_environment` method of the `QianfanChatEndpoint` class. This could be the root cause of the issue.
To resolve this, you can modify the `validate_environment` method to include the `api_base` parameter when initializing the `qianfan.ChatCompletion` client. Here's how you can do it:
```python
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
    values["qianfan_ak"] = get_from_dict_or_env(
        values,
        "qianfan_ak",
        "QIANFAN_AK",
    )
    values["qianfan_sk"] = get_from_dict_or_env(
        values,
        "qianfan_sk",
        "QIANFAN_SK",
    )
    params = {
        "ak": values["qianfan_ak"],
        "sk": values["qianfan_sk"],
        "model": values["model"],
        "stream": values["streaming"],
        "api_base": values.get("api_base"),  # Add this line
    }
    if values["endpoint"] is not None and values["endpoint"] != "":
        params["endpoint"] = values["endpoint"]
    try:
        import qianfan

        values["client"] = qianfan.ChatCompletion(**params)
    except ImportError:
        raise ValueError(
            "qianfan package not found, please install it with "
            "`pip install qianfan`"
        )
    return values
```
This modification ensures that the `api_base` parameter is included when initializing the `qianfan.ChatCompletion` client, which should resolve the issue of it not being properly propagated in client calls.
Please try this solution and let us know if it resolves your issue. If you have any further questions or run into any other issues, feel free to ask.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot You may be right, but your proposed solution targets the wrong file: it modifies libs/langchain/langchain/chat_models/baidu_qianfan_endpoint.py, when it should modify libs/langchain/langchain/chat_models/litellm.py. Specifically, the code snippet to change according to your previous instructions would be:
```python
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
    """Validate api key, python package exists, temperature, top_p, and top_k."""
    try:
        import litellm
    except ImportError:
        raise ChatLiteLLMException(
            "Could not import google.generativeai python package. "
            "Please install it with `pip install google-generativeai`"
        )
    values["openai_api_key"] = get_from_dict_or_env(
        values, "openai_api_key", "OPENAI_API_KEY", default=""
    )
    values["azure_api_key"] = get_from_dict_or_env(
        values, "azure_api_key", "AZURE_API_KEY", default=""
    )
    values["anthropic_api_key"] = get_from_dict_or_env(
        values, "anthropic_api_key", "ANTHROPIC_API_KEY", default=""
    )
    values["replicate_api_key"] = get_from_dict_or_env(
        values, "replicate_api_key", "REPLICATE_API_KEY", default=""
    )
    values["openrouter_api_key"] = get_from_dict_or_env(
        values, "openrouter_api_key", "OPENROUTER_API_KEY", default=""
    )
    values["cohere_api_key"] = get_from_dict_or_env(
        values, "cohere_api_key", "COHERE_API_KEY", default=""
    )
    values["huggingface_api_key"] = get_from_dict_or_env(
        values, "huggingface_api_key", "HUGGINGFACE_API_KEY", default=""
    )
    values["together_ai_api_key"] = get_from_dict_or_env(
        values, "together_ai_api_key", "TOGETHERAI_API_KEY", default=""
    )
    values["client"] = litellm
    if values["temperature"] is not None and not 0 <= values["temperature"] <= 1:
        raise ValueError("temperature must be in the range [0.0, 1.0]")
    if values["top_p"] is not None and not 0 <= values["top_p"] <= 1:
        raise ValueError("top_p must be in the range [0.0, 1.0]")
    if values["top_k"] is not None and values["top_k"] <= 0:
        raise ValueError("top_k must be positive")
    return values
```
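A minimal sketch of the kind of change being discussed, mirroring the key lookups above; it would slot into the validator quoted here, and the `LITELLM_API_BASE` environment variable name is purely an assumption for illustration:

```python
# Hypothetical addition inside validate_environment, mirroring the lookups above.
# The LITELLM_API_BASE environment variable name is an assumption, not a real setting.
values["api_base"] = get_from_dict_or_env(
    values, "api_base", "LITELLM_API_BASE", default=""
)
```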
🤖
Thank you for the detailed report, DavidLMS. You're correct in your analysis that the `api_base` parameter is not being propagated correctly in the `ChatLiteLLM` class. This is indeed due to the `api_base` parameter not being passed to the `litellm` client in the `validate_environment` method.
To resolve this issue, you can modify the `validate_environment` method to include the `api_base` parameter when initializing the `litellm` client. Here's how you can do it:
```python
def validate_environment(self):
    if self.client is None:
        self.client = litellm.LiteLLM(api_base=self.api_base)
```
This should ensure that the `api_base` parameter is correctly utilized when making requests to the LiteLLM client.
However, it's also important to note that the `api_base` parameter is set in the `_client_params` property, but this property is not used in the `acompletion_with_retry` and `completion_with_retry` methods where the API calls are made.
To fix this, you can modify these methods to use the `_client_params` property instead of directly using the `client` object. Here's how you can do it:
```python
def _completion_with_retry(**kwargs: Any) -> Any:
    return self.client.completion(**self._client_params, **kwargs)

return _completion_with_retry(**kwargs)
```
And for the async version:
```python
async def _completion_with_retry(**kwargs: Any) -> Any:
    return await self.client.acreate(**self._client_params, **kwargs)

return await _completion_with_retry(**kwargs)
```
This way, the `api_base` and other client parameters will be correctly propagated in the API calls.
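As a self-contained illustration of that merge (the fake client and helper function below are invented for the example; caller kwargs take precedence over the instance-level params because they are spread last):

```python
from typing import Any, Dict


class _FakeLiteLLM:
    """Stand-in for the litellm module, for illustration only."""

    def completion(self, **kwargs: Any) -> Dict[str, Any]:
        return kwargs  # echo back what would be sent to the API


def completion_with_client_params(
    client: _FakeLiteLLM, client_params: Dict[str, Any], **kwargs: Any
) -> Dict[str, Any]:
    # Spread the instance-level params first so explicit call kwargs win on conflicts.
    return client.completion(**{**client_params, **kwargs})


client_params = {
    "model": "huggingface/Intel/neural-chat-7b-v3-1",
    "api_base": "https://my-endpoint.example.com",  # placeholder URL
}
sent = completion_with_client_params(_FakeLiteLLM(), client_params, temperature=0.2)
print(sent)  # api_base is now part of every outgoing request payload
```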
Please try these modifications and let me know if the issue persists. If you have any further questions or need additional assistance, feel free to ask.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
I will try to make a Pull Request that solves the problem.
Solved in this PR #14393
System Info
Who can help?
@hwchase17 @agola11
Information
Related Components
Reproduction
Error:
The same error occurs if:

```python
chat_model = ChatLiteLLM(model="huggingface/Intel/neural-chat-7b-v3-1")
```

So the `api_base` parameter is not properly propagated in client calls in `ChatLiteLLM`.
Expected behavior
I would expect the ChatLiteLLM instance to correctly utilize the api_base parameter when making requests to the LiteLLM client. This should enable using models larger than the default size limit without encountering the error message about model size limits.
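For example, usage along these lines would be expected to work (the endpoint URL below is a placeholder, not a real endpoint):

```python
from langchain.chat_models import ChatLiteLLM

# Expected usage: api_base should reach litellm's completion call.
chat_model = ChatLiteLLM(
    model="huggingface/Intel/neural-chat-7b-v3-1",
    api_base="https://my-inference-endpoint.example.com",  # placeholder URL
)
```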
Notably, if I explicitly add the `api_base` argument in `chat_models/litellm.py` on line 239, e.g.

```python
return self.client.completion(api_base=self.api_base, **kwargs)
```

the problem is resolved. This suggests that the `api_base` argument is not being correctly passed through `**kwargs`.