Add support for Anthropic's prompt caching feature to the SpiceMessages class in the spice library. This will enable faster and more cost-efficient API calls by reusing cached prompt prefixes. Additionally, track cache performance metrics to verify cache hits and the number of input tokens cached.
Changes Required
1. Update SpiceMessages Class:
   - Add a cache argument to the message creation methods.
   - Set the cache_control parameter based on the cache argument.
2. Modify Message Creation Functions:
   - Update the message creation functions to handle the cache argument.
3. Track Cache Performance Metrics:
   - Update the get_response method in the Spice class to handle the new caching-related API response fields.
   - Log the number of input tokens cached and verify cache hits using the client.extract_text_and_tokens method.
Implementation Details
1. Update SpiceMessages Class
Modify the SpiceMessages class to include the cache argument:
```python
class SpiceMessages(UserList[SpiceMessage]):
    ...

    def add_message(self, role: Literal["user", "assistant", "system"], content: str, cache: bool = False):
        self.data.append(create_message(role, content, cache))

    def add_user_message(self, content: str, cache: bool = False):
        """Appends a user message with the given content."""
        self.data.append(user_message(content, cache))

    def add_system_message(self, content: str, cache: bool = False):
        """Appends a system message with the given content."""
        self.data.append(system_message(content, cache))

    def add_assistant_message(self, content: str, cache: bool = False):
        """Appends an assistant message with the given content."""
        self.data.append(assistant_message(content, cache))

    ...
```
2. Modify Message Creation Functions
Update the message creation functions to handle the cache argument:
```python
def create_message(role: Literal["user", "assistant", "system"], content: str, cache: bool = False) -> ChatCompletionMessageParam:
    message = {"role": role, "content": content}
    if cache:
        message["cache_control"] = {"type": "ephemeral"}
    return message

def user_message(content: str, cache: bool = False) -> ChatCompletionUserMessageParam:
    """Creates a user message with the given content."""
    return create_message("user", content, cache)

def system_message(content: str, cache: bool = False) -> ChatCompletionSystemMessageParam:
    """Creates a system message with the given content."""
    return create_message("system", content, cache)

def assistant_message(content: str, cache: bool = False) -> ChatCompletionAssistantMessageParam:
    """Creates an assistant message with the given content."""
    return create_message("assistant", content, cache)
```
3. Track Cache Performance Metrics
Update the get_response method in the Spice class to handle the new API response fields related to caching:
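Anthropic's prompt caching API reports cache activity via cache_creation_input_tokens and cache_read_input_tokens in the response's usage block. A minimal sketch of how get_response might surface these metrics (the helper names and logging shown here are illustrative assumptions, not existing spice APIs):

```python
import logging

logger = logging.getLogger(__name__)

def extract_cache_metrics(usage) -> tuple[int, int]:
    """Return (tokens read from cache, tokens newly cached), defaulting to 0.

    The attribute names come from Anthropic's prompt caching usage block;
    they are absent on non-Anthropic responses, so getattr defaults apply.
    """
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    return read, written

def log_cache_metrics(usage) -> None:
    """Log a cache hit (tokens read) and/or a cache write (tokens stored)."""
    read, written = extract_cache_metrics(usage)
    if read:
        logger.info("Cache hit: %d input tokens read from cache", read)
    if written:
        logger.info("Cache write: %d input tokens newly cached", written)
```

get_response could call log_cache_metrics on the usage object returned alongside client.extract_text_and_tokens, leaving non-Anthropic responses unaffected.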
Example Usage

Here's an example of how you might use the updated SpiceMessages class with caching:

```python
from spice import Spice
from spice.spice_message import SpiceMessages

client = Spice()
messages = SpiceMessages(client)
messages.add_system_message("You are an AI assistant tasked with analyzing literary works.", cache=True)
messages.add_user_message("Analyze the major themes in 'Pride and Prejudice'.", cache=True)

response = await client.get_response(messages=messages, model="claude-3-5-sonnet-20240620")
print(response.text)
```
Acceptance Criteria
- The SpiceMessages class should support the cache argument.
- The get_response method should log cache performance metrics.
- The implementation should be backward compatible and ignore the cache argument for non-Anthropic clients.
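For the backward-compatibility criterion, one way to make non-Anthropic clients ignore the cache argument is to strip the Anthropic-only cache_control key before dispatching the request. A hedged sketch (the helper name is hypothetical, not part of the current spice API):

```python
def strip_cache_control(messages: list[dict]) -> list[dict]:
    """Return copies of messages without the Anthropic-only cache_control key.

    Called before sending messages to a non-Anthropic client (e.g. OpenAI),
    so cache=True becomes a silent no-op rather than an invalid parameter.
    The originals are left untouched.
    """
    return [{k: v for k, v in m.items() if k != "cache_control"} for m in messages]
```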