Open anqma opened 3 months ago
@anqma Привет! I'm here to assist you with any issues or questions you may have. Feel free to ask for help while we wait for a human maintainer.
To ensure that an error is raised when using the parameter model
instead of model_name
in the BedrockEmbedding
class, you can modify the __init__
method of the BedrockEmbedding
class to check for the presence of the model
parameter and raise a ValueError
if it is found. Here is an example of how you can implement this:
class BedrockEmbedding(BaseEmbedding):
model_name: str = Field(description="The modelId of the Bedrock model to use.")
# other fields...
def __init__(
self,
model_name: str = Models.TITAN_EMBEDDING,
profile_name: Optional[str] = None,
aws_access_key_id: Optional[str] = None,
aws_secret_access_key: Optional[str] = None,
aws_session_token: Optional[str] = None,
region_name: Optional[str] = None,
client: Optional[Any] = None,
botocore_session: Optional[Any] = None,
botocore_config: Optional[Any] = None,
additional_kwargs: Optional[Dict[str, Any]] = None,
max_retries: int = 10,
timeout: float = 60.0,
callback_manager: Optional[CallbackManager] = None,
# base class
system_prompt: Optional[str] = None,
messages_to_prompt: Optional[Callable[[Sequence[ChatMessage]], str]] = None,
completion_to_prompt: Optional[Callable[[str], str]] = None,
pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
output_parser: Optional[BaseOutputParser] = None,
**kwargs: Any,
):
if 'model' in kwargs:
raise ValueError("Use 'model_name' instead of 'model' to specify the model.")
additional_kwargs = additional_kwargs or {}
session_kwargs = {
"profile_name": profile_name,
"region_name": region_name,
"aws_access_key_id": aws_access_key_id,
"aws_secret_access_key": aws_secret_access_key,
"aws_session_token": aws_session_token,
"botocore_session": botocore_session,
}
try:
import boto3
from botocore.config import Config
config = (
Config(
retries={"max_attempts": max_retries, "mode": "standard"},
connect_timeout=timeout,
read_timeout=timeout,
)
if botocore_config is None
else botocore_config
)
session = boto3.Session(**session_kwargs)
except ImportError:
raise ImportError(
"boto3 package not found, install with 'pip install boto3'"
)
if client is not None:
self._client = client
elif "bedrock-runtime" in session.get_available_services():
self._client = session.client("bedrock-runtime", config=config)
else:
self._client = session.client("bedrock", config=config)
super().__init__(
model_name=model_name,
max_retries=max_retries,
timeout=timeout,
botocore_config=config,
profile_name=profile_name,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_session_token=aws_session_token,
region_name=region_name,
botocore_session=botocore_session,
additional_kwargs=additional_kwargs,
callback_manager=callback_manager,
system_prompt=system_prompt,
messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,
pydantic_program_mode=pydantic_program_mode,
output_parser=output_parser,
**kwargs,
)
This code snippet checks if the model
parameter is present in the keyword arguments (kwargs
). If it is, a ValueError
is raised with a message indicating that model_name
should be used instead. This will prevent the unintended use of the default model by ensuring that the correct parameter is used [1][2][3].
To continue talking to Dosu, mention @dosu.
part of this is a consequence of how pydantic works, and also partly the abuse of kwargs.
The only solution to this is to explicitly raise an error in the code 🤷🏻 Obviously, the model/model_name kwarg should be the same across LLMs and embedding models, but it isn't right now
kwargs abuse is ruining this lib. A lot of unexpected/silenced bad behaviors
Bug Description
The use of the "model" instead of "model_name" in BedrockEmbedding class led to the automatic selection of the default model and doesn't raise an error. This might lead to use the wrong model.
Тhe following code calls the default model which is "amazon.titan-embed-text-v1" : model = BedrockEmbedding(model = "cohere.embed-multilingual-v3")
The following code calls the desired model ("cohere.embed-multilingual-v3"): model = BedrockEmbedding(model_name = "cohere.embed-multilingual-v3")
Version
0.10.44
Steps to Reproduce
Import the BedrockEmbedding class from the llama_index package in the Python script or notebook
Instantiate the BedrockEmbedding class using the model instead of model_name. This is the incorrect usage that leads to the automatic selection of the default model. model = BedrockEmbedding(model="cohere.embed-multilingual-v3")
Since the model parameter is incorrectly used, the BedrockEmbedding class should automatically select the default model ("amazon.titan-embed-text-v1") instead of the desired model ("cohere.embed-multilingual-v3"). There should be no error message indicating that the wrong parameter was used, which is the crux of the bug.
Attempt to instantiate the BedrockEmbedding class using the correct parameter, model_name, to specify the desired model. model = BedrockEmbedding(model_name="cohere.embed-multilingual-v3")
Relevant Logs/Tracbacks
No response