Closed willbakst closed 3 months ago
Yah, I think the main impetus for this feedback is:

> …`call` internally). This pattern has felt gross to me because 1/ it requires setting internal state for things that really aren't stateful, just so they get passed into the prompt template, and 2/ the `call` function is still part of the public interface of the class, and depending on whether or not you've called my wrapper function before, the `call` function will either fail (because it's using empty-string initialized private vars) or it will "succeed" with whatever the last args were.
>
> Alternatively, you could push things towards a world where `call` is meant to be used once and discarded (therefore stateless), in which case the current pattern works. But I think the examples in your docs that leverage the fact that calls can be stateful should then change, and you'd probably want to come up with some new patterns that wrap calls and manage state.
>
> So it sort of depends on where, philosophically, you view call classes sitting in the longer-term vision for the library. Are agents / chatbots / more stateful things subclasses of calls? Or are they wrappers around calls?
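A minimal sketch of the pattern being described (all names here are hypothetical, not Mirascope's actual API): a wrapper method stashes its argument in a private field just so the prompt template can read it, which makes `call` itself order-dependent.

```python
class BookRecommender:
    """Illustrates the stateful-call antipattern: a wrapper sets private
    state that call() silently depends on."""

    def __init__(self) -> None:
        self._genre = ""  # empty-string initialized private var

    @property
    def prompt(self) -> str:
        return f"Recommend a {self._genre} book."

    def call(self) -> str:
        # Pretend this hits an LLM API; here we just echo the prompt.
        return self.prompt

    def recommend(self, genre: str) -> str:
        self._genre = genre  # internal state set just to feed the template
        return self.call()


recommender = BookRecommender()
print(recommender.call())                # broken prompt: genre is still ""
print(recommender.recommend("fantasy"))  # "Recommend a fantasy book."
print(recommender.call())                # reuses whatever the last args were
```

Calling `call` before `recommend` produces a prompt with a hole in it; calling it afterwards silently reuses stale state.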
Inside of Mirascope I want to make it really easy to implement an agent while maintaining flexibility and extensibility. I view this as first making it as easy as possible to implement an agent as a subclass of a call class in a way that doesn't smell. I think separating call arguments from state and enabling overriding `call`, `stream`, `call_async`, and `stream_async` is the best way to do this that I can currently think of. We should also do this for extractors.
Thank you for the feedback and suggestions here!
Of course, this will result in a lot of repetitive code across agents if you as the end user need to define each of these methods for each agent. This is where I think something like `BaseAgent` with subclasses e.g. `OpenAIAgent` would come in as a second step. These classes would have methods like `step`/`run` or something that handle a lot of the duplicated code. The interface here has to maintain maximum extensibility / flexibility for writing more custom flows. I imagine something like a step being a single call that handles state updates around history and exposes handlers for content and tools, and then just running the step in a loop until the agent says the loop is done.
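That step/run loop could be sketched like this (purely illustrative; `Agent`, `step`, and `run` here are hypothetical stand-ins, with a fake LLM in place of a provider call):

```python
from typing import Callable


class Agent:
    """Hypothetical sketch of a step/run agent loop."""

    def __init__(self, llm: Callable[[list[str]], str]) -> None:
        self.llm = llm            # stand-in for a provider API call
        self.history: list[str] = []

    def step(self, user_input: str) -> tuple[str, bool]:
        """One call: update history, return content and a done flag."""
        self.history.append(f"user: {user_input}")
        content = self.llm(self.history)
        self.history.append(f"assistant: {content}")
        return content, content.endswith("DONE")

    def run(self, user_input: str) -> str:
        """Run steps in a loop until the agent says it's done."""
        content, done = self.step(user_input)
        while not done:
            content, done = self.step("continue")
        return content


# Fake LLM that finishes on the second step.
responses = iter(["thinking...", "answer DONE"])
agent = Agent(lambda history: next(responses))
print(agent.run("solve this"))  # "answer DONE"
print(len(agent.history))       # 4 messages recorded
```

The point is that `step` owns the history update and termination signal, so subclasses only customize the content/tool handling.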
The agent stuff definitely needs more thought and design work. Lots of examples to see how to provide the best interface.
I'm also interested in implementing more of an agent framework as a separate library built on top of Mirascope with a lot of the same philosophy, except that library will have more fleshed out workflows like ReAct or Query Planning as pre-built execution flows (making them a little less extensible). Ideally this framework would be easy to build because `BaseAgent` classes provide a fantastic interface for easily implementing all the different agent workflows that pop up every day. This would also make it great for researchers to implement new research using Mirascope, so long as we provide detailed examples and great docs.
@willbakst
I certainly feel it is essential to distinguish between the Arguments and the State.
I have thought about how to realize this idea.
Perhaps there is no problem with doing call argument validation at runtime. However, we need to think a bit about static type checking.

> I want things to be properly typed so the consumer knows to pass in the genre when calling a call.

I certainly agree with this approach. The user should be able to understand explicitly, from the IDE, etc., which arguments to pass.
However, I don't think the existing Python specification allows full static type checking to be applied to the `call` method's signature with your idea.
> Not 100% how to best implement this yet... Likely some version of paramspec update with generics on base and subclass calls and ...

As you stated, it certainly feels like it could be achieved using `ParamSpec` and `Concatenate`.
However, the current Python specification does not allow you to expand constructor signatures from Pydantic models. To explain the root cause a little more: it is impossible to expand parameters from a callable because Python has no construct like `ParametersOf`, which was considered and rejected in the ParamSpec PEP. Currently, `ParamSpec` assumes cases where the arguments of the callable are explicitly specified, such as decorators, so I don't think it can be used in this use case.
As I don't have time today, I will think about the `CallArgs` handover method again tomorrow.
> As I don't have time today, I will think about the CallArgs handover method again tomorrow.
I investigated and considered good approaches for defining types to realize your code example. At least, I couldn't find a good way to define types that work with Pyright. Even if it could be defined in a very complex and tricky way, I don't think it would be possible to achieve completion and validation across all type checkers and IDEs, so I thought it might be better to implement it a bit more explicitly. After all, I felt it was not intuitive because there is no definition of the call signature in the code. (Pydantic also uses magic in the signature of `__init__`, but features like dataclasses and attrs, as well as similar data-class functionality in Kotlin and other languages, are well established; developers understand them and IDEs and type checkers support them, so it does not seem to be a problem.)
Supplement: In my understanding, Pyright supports fairly up-to-date PEPs. It also handles complex type analysis. (Last month I wrote a decorator using `ParamSpec` and `Concatenate`, but type checking didn't work correctly with mypy.) Also, since many projects still use mypy, even if it could be expressed using complex generic type definitions utilizing the latest typing features, I thought it wouldn't be checked correctly in many users' environments. Regarding IDEs, I thought the same problem would occur due to differences in the implementations of each LSP and PyCharm.
I considered using `CallArgs` as an argument to `call()` as a way to overcome the above issues. This approach is quite simple, and the solution with generics is clear. Users must pass a `CallArgs` instance to `call()`. Regarding Pydantic, there are no issues with completion or type checking, and even without support from editors or type checkers, the necessary `CallArgs` can be confirmed, thus improving readability. The drawback is that it is necessary to import and create the corresponding `CallArgs` model each time, which might seem cumbersome, but I think it is within an acceptable range. It clarifies the necessary arguments, and I believe it will be seen positively by users.
What do you think about this idea?
Here is example code:

```python
from abc import ABC
from typing import Generic, TypeVar

from pydantic import BaseModel


class CallArgs(BaseModel):
    ...


CallArgsT = TypeVar("CallArgsT", bound=CallArgs)


class OpenAICall(BaseModel, Generic[CallArgsT], ABC):
    def call(self, *, call_args: CallArgsT, **kwargs) -> ...:
        ...

    @property
    def call_args(self) -> CallArgsT:
        ...


class BookRecommenderArgs(CallArgs):
    genre: str


class AuthorSelector(OpenAICall[BookRecommenderArgs]):
    prompt_template = "Recommend an author that writes {genre} books."


class BookRecommender(OpenAICall[BookRecommenderArgs]):
    prompt_template = "Recommend a book written by {author}."

    author_selector: AuthorSelector = AuthorSelector()

    @property
    def author(self) -> str:
        return self.author_selector.call(
            call_args=BookRecommenderArgs(genre=self.call_args.genre)
        )


recommender = BookRecommender()
response = recommender.call(call_args=BookRecommenderArgs(genre="fantasy"))
print(response.content)
```
If this is the best solution possible / we can come up with, I'm not necessarily opposed.
It achieves the separation of arguments and state. We just need to also make sure that the call arguments can be optional, and we need to make sure the user can override `call` and other such methods, including even the return type.
However, I don't think this is ideal.
Could we implement it as a decorator to get the param spec? Something like:
```python
from mirascope.base import call_args
from mirascope.openai import OpenAICall


@call_args(genre=str)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book"


recommender = BookRecommender()
response = recommender.call(genre="fantasy")
print(response.content)
```
Realizing we'll also likely need to bubble the functionality all the way down to `BasePrompt` no matter what, so we should also think through what the flow for `BasePrompt` looks like.
Top of mind is something like this:
```python
from mirascope.base import BasePrompt, call_args


@call_args(genre=str)
class BookRecommendationPrompt(BasePrompt):
    prompt_template = "Recommend a {genre} book"


prompt = BookRecommendationPrompt()
messages = prompt.messages(genre="fantasy")
print(messages)
#> [{"role": "user", "content": "Recommend a fantasy book"}]
```
I agree with your opinion.
I have considered the feasibility of the new proposal.
Currently, I think it is difficult for this approach to work correctly with static type checking. I apologize for not being able to realize your idea, but current Python lacks sufficient support for complex type concepts. Specifically, class decorators do not work correctly with type checkers like mypy. To be more specific, since Python does not support `Intersection`, it is difficult to use class decorators to modify the arguments of the `call` method.
https://github.com/python/typing/issues/213
Additionally, it is challenging to dynamically generate method signatures from type information defined in arguments like `@call_args(genre=str)` for use in static type checking. Even though `pydantic.create_model()` generates a Pydantic model from arguments, it does not provide type information usable for static analysis.
https://github.com/pydantic/pydantic/blob/d93a0460e89f4de67bae1af829c9d028b02e4964/pydantic/main.py#L1459-L1471
To work around this issue, I added a feature to the PyCharm Pydantic plugin to analyze `create_model`. I created this plugin because Pydantic constructors could not be type-checked. With the introduction of PEP 681 (`@dataclass_transform`), static type analysis for data-like model constructors such as Pydantic's is finally supported in mypy and Pyright (the PEP was proposed to support Pyright and VS Code).
https://peps.python.org/pep-0681
I also considered defining `call_args` with a `TypedDict` and unpacking it as the type hint for the `call` method, but this is also not supported.
https://github.com/python/typing/issues/1399
In other words, the only way to dynamically change method arguments while supporting type checkers is to use decorators on methods.
Here is an example. Although mypy and pyright will work correctly, there may not be many benefits for users when implementing it.
```python
class BookRecommenderArgs(CallArgs):
    genre: str


class BookRecommender(OpenAICall):
    prompt_template = "Recommend a book written by {author}."

    @call_args(BookRecommenderArgs)
    def call(self, *args, **kwargs) -> ...:
        response = super().call(*args, **kwargs)
        # do something with response
        return response


BookRecommender().call(genre="fantasy", xyz=123)
# mypy main.py
# >>> ... error: Unexpected keyword argument "xyz" for "call" of "BookRecommender"  [call-arg]
```
Instead of this, it might be better to dynamically validate and map the signature of the user-defined `call` method within `super().call(*args, **kwargs)`.
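That runtime validate-and-map step could look roughly like this (a stdlib-only sketch with hypothetical names, not Mirascope code): the base `call` checks the incoming kwargs against a declared args spec and raises on unknown or missing names before doing any work.

```python
class CallArgsError(TypeError):
    pass


class BaseCall:
    """Sketch: validate kwargs at runtime against a declared args spec."""

    call_arg_types: dict[str, type] = {}

    def call(self, **kwargs):
        unexpected = set(kwargs) - set(self.call_arg_types)
        if unexpected:
            raise CallArgsError(f"unexpected call args: {sorted(unexpected)}")
        missing = set(self.call_arg_types) - set(kwargs)
        if missing:
            raise CallArgsError(f"missing call args: {sorted(missing)}")
        return kwargs  # a real implementation would format the prompt here


class BookRecommender(BaseCall):
    call_arg_types = {"genre": str}


recommender = BookRecommender()
print(recommender.call(genre="fantasy"))  # {'genre': 'fantasy'}
try:
    recommender.call(genre="fantasy", xyz=123)
except CallArgsError as e:
    print(e)  # unexpected call args: ['xyz']
```

The error surfaces only when `call` actually runs, which is exactly the usability trade-off discussed below.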
> Realizing we'll also likely need to bubble the functionality all the way down to BasePrompt no matter what, so we should also think through what the flow for BasePrompt looks like.
I think this can also be implemented in the same way as OpenAICall. Are there any specific concerns?
> It achieves the separation of arguments and state. We just need to also make sure that the call arguments can be optional, and we need to make sure the user can override the call and other such methods, including even the return type.
I have one concern with my proposal: how to distinguish between required and optional call_args. It seems challenging to differentiate between required and optional solely using generics.
> I think this can also be implemented in the same way as OpenAICall. Are there any specific concerns?
I just want to make sure as we design/implement this that we are mindful of making it possible for `BasePrompt` as well. Since `BaseCall` and `BaseExtractor` subclass `BasePrompt`, this will inevitably impact how we should implement things in the library. We also need to make sure that we can separate the call arguments provided by the user for the prompt template from the call parameters the user can provide through `**kwargs` that get passed into the API call to e.g. OpenAI.
> Instead of this, it might be better to dynamically validate and map the signature of the user-defined call method within super().call(*args, **kwargs).
I think my key desires here are:

1. The user can override `call` and other such methods as described.
2. There's a level of type hinting and editor support so that e.g. `recommender.call(genre="fantasy")` shows you that you need to provide `genre`.
3. Ideally some level of type hint for the user when overriding the method (with some form of mapping like you've mentioned?). Something like `call(self, *, genre: str, **kwargs)` and `super().call(genre=genre, **kwargs)` instead of just `*args` and `**kwargs`.
4. Make it as easy as possible to provide call arguments without overriding every function.
For (4), I imagine something like this:
```python
class BookRecommenderArgs(CallArgs):
    genre: str


@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."


recommender = BookRecommender()

# all of the below should fail with unexpected keyword arg "xyz"
recommender.call(genre="fantasy", xyz=123)
recommender.call_async(genre="fantasy", xyz=123)
recommender.stream(genre="fantasy", xyz=123)
recommender.stream_async(genre="fantasy", xyz=123)
```
But it sounds like you're saying this isn't possible because even though we could wrap every method inside of `BookRecommender`, the static type checker won't catch it?
> I have one concern with my proposal: how to distinguish between required and optional call_args. It seems challenging to differentiate between required and optional solely using generics.
For call arguments here, I don't think there's really any reason to support "optional" so much as just optional through default. The reason is that the prompt template will be dependent on the call args, so we'll need some value to work with. The point is to require these call arguments for the user of the class. In this case, we should be able to just define fields with defaults? For example:
```python
class BookRecommenderArgs(CallArgs):
    genre: str = "fantasy"


@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."


BookRecommender().call()  # genre is optional and defaults to "fantasy"
```
> I apologize for not being able to realize your idea, but the current Python lacks sufficient support for complex type concepts.
Not your fault Python doesn't support it! Makes me wonder if we're thinking too boxed in to the current design. Maybe there's a different way we could approach this to make it possible with existing python?
For example, what if we just accepted the limitations of Python and implemented the feature without optimal static type checking (and add better type checking when possible re: issues you linked / maybe a VSCode + PyCharm extension for it?).
Consider the following:
```python
class BookRecommenderArgs(CallArgs):
    genre: str


@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."


recommender = BookRecommender()
response = recommender.call(genre="fantasy")  # no type hints but fails if genre isn't provided
print(response.content)
```
We can make the above possible; it's just not great with respect to static type checking. Then we can add better static type checking if/when Python makes it possible. I wouldn't suggest this if there seemed to be a better route, but unfortunately it seems like this is the best option?

A nice benefit of doing it this way is that overriding the call method should be easy, meaning that implemented methods that are statically typed would look something like this:
```python
@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."

    def call(self, *, genre: str, **kwargs) -> OpenAICallResponse:
        return super().call(genre=genre, **kwargs)
        ...

    def stream(self, *, genre: str, **kwargs) -> Generator[OpenAICallResponseChunk, None, None]:
        yield from super().stream(genre=genre, **kwargs)
        ...
```
Of course, this isn't ideal, but it's better I think than anything else so far?
Thank you for your suggestions.
> The user can override the call and other such methods as described.
I think this is not a problem.
> There's a level of type hinting and editor support so that e.g. recommender.call(genre="fantasy") shows you that you need to provide genre.
As you understand, this is not supported due to the current limitations of Python. I will provide future solutions for this later.
> Ideally some level of type hint for the user when overriding the method (with some form of mapping like you've mentioned?) Something like `call(self, *, genre: str, **kwargs)` and `super().call(genre=genre, **kwargs)` instead of just `*args` and `**kwargs`.
This cannot provide completion or warnings in editors or type checkers. However, at runtime we can dynamically check the method and its signature during class definition, so mistakes can be noticed easily when running locally with unittest, etc. At that point, appropriate correction information can be displayed to the user as an error message.
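That definition-time check can be sketched with `__init_subclass__` and `inspect.signature` (a hypothetical, stdlib-only illustration): a subclass that overrides `call` without declaring the required call args fails the moment the class body executes, not when `call` is first invoked.

```python
import inspect


class BaseCall:
    required_call_args: tuple[str, ...] = ()

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # validate any overridden call() as soon as the class is defined
        if "call" in cls.__dict__ and cls.required_call_args:
            params = inspect.signature(cls.__dict__["call"]).parameters
            missing = [a for a in cls.required_call_args if a not in params]
            if missing:
                raise TypeError(
                    f"{cls.__name__}.call is missing call args: {missing}"
                )

    def call(self, **kwargs):
        ...


class BookRecommender(BaseCall):
    required_call_args = ("genre",)

    def call(self, *, genre: str, **kwargs):  # OK: declares genre
        return f"Recommend a {genre} book."


try:
    class BadRecommender(BookRecommender):
        def call(self, *, topic: str, **kwargs):  # missing genre
            ...
except TypeError as e:
    print(e)  # BadRecommender.call is missing call args: ['genre']
```

The error fires during import/class definition, so it surfaces in any local run or test, matching the "noticed easily with unittest" point above.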
> Make it as easy as possible to provide call arguments without overriding every function.
As you mentioned, this does not benefit from type checkers. It also cannot be checked at runtime until `call()` is actually executed, which may result in lower usability during implementation and delayed discovery of argument errors.
> But it sounds like you're saying this isn't possible because even though we could wrap every method inside of BookRecommender, the static type checker won't catch it?
Yes, my concern is that unless users implement unittests locally with proper mocking of external API calls, it will be difficult to discover issues. We wouldn't want incidents after deploying to production.
> In this case, we should be able to just define fields with defaults? For example:
I agree with this approach. I don't see any problems with it.
> For example, what if we just accepted the limitations of Python and implemented the feature without optimal static type checking (and add better type checking when possible re: issues you linked / maybe a VSCode + PyCharm extension for it?).
In the end, our discussion has highlighted the limitations of Python's type checking at the language level. While hoping these issues will be resolved in the future, I propose some workarounds here (of course, creating a PEP to improve the language specification in parallel would be ideal).

To cover everything, here are the conclusions: if Ruff supported plugins, it would be the best option, since it is used in many projects and I already provide the plugin for PyCharm. Although Ruff does not support completion, it can issue warnings and is expected to provide type checking in the future, which aligns with our goals. (I heard from a member of the development team at PyCon US that Ruff will be offering a type checker in a few months.)
> But unfortunately it seems like this is the best option?
>
> Of course, this isn't ideal, but it's better I think than anything else so far?
I agree with your proposal, but I have one concern.
> Make it as easy as possible to provide call arguments without overriding every function.
As I mentioned earlier, I am a little worried about unexpected errors due to wrong arguments when calling.
> As I mentioned earlier, I am a little worried about unexpected errors due to wrong arguments when calling.
I agree. I think if we go down this path, the recommendation when using call args would be to do the following:
```python
@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."

    def call(self, *, genre: str, **kwargs) -> OpenAICallResponse:
        return super().call(genre=genre, **kwargs)

    async def call_async(self, *, genre: str, **kwargs) -> OpenAICallResponse:
        return await super().call_async(genre=genre, **kwargs)

    def stream(self, *, genre: str, **kwargs) -> Generator[OpenAICallResponseChunk, None, None]:
        yield from super().stream(genre=genre, **kwargs)

    async def stream_async(self, *, genre: str, **kwargs) -> AsyncGenerator[OpenAICallResponseChunk, None]:
        async for chunk in super().stream_async(genre=genre, **kwargs):
            yield chunk
```
This provides type checker and runtime support requiring `genre`. However, it's a bit verbose / inconvenient to have to write this every single time I want to use call arguments.
With the above implementation, the following would still work regardless:
```python
@call_args(BookRecommenderArgs)
class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."


recommender = BookRecommender()
response = recommender.call(genre="fantasy")  # no type checking here since user didn't define it
```
Can you think of a way to provide a more convenient way of implementing the first snippet so we still get type checking but it's not as verbose / inconvenient to write for every call?
Random thought: could we do something like `ClassVar` or `PrivateAttr` to make a field a call argument instead? I'm thinking something like this could be really cool (but not possible??):
```python
from mirascope.base import CallArg
from mirascope.openai import OpenAICall


class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."

    genre: CallArg[str]


recommender = BookRecommender()
response = recommender.call(genre="fantasy")
print(response.content)
```
Could we somehow then get type hints on the `OpenAICall` methods typed by all `CallArg` fields?
> Can you think of a way to provide a more convenient way of implementing the first snippet so we still get type checking but it's not as verbose / inconvenient to write for every call?
I'll think about if there's a good way to do this.
> Could we somehow then get type hints on the OpenAICall methods typed by all CallArg fields?
This ultimately becomes the same type checking that Pydantic originally aimed to achieve. In other words, it is not possible to collect the types of specific attributes.
> This ultimately becomes the same type checking that Pydantic originally aimed to achieve. In other words, it is not possible to collect the types of specific attributes.
This is very unfortunate. Regardless, we should still try to implement this to the best of our ability given current Python typing limitations.
I'm not upset with the most recent iteration, and if we can find a good shorthand even better.
However, I'm thinking suggestions like #312 might lend themselves better to this desired outcome since we'll have functions that we can properly type given current limitations.
> I'll think about if there's a good way to do this.
I think it would be most appropriate to include usage and snippets in the documentation of the inherited method, as is the case with many libraries.
> However, I'm thinking suggestions like https://github.com/Mirascope/mirascope/issues/312 might lend themselves better to this desired outcome since we'll have functions that we can properly type given current limitations.
This seems to be a good approach that works correctly from the type checker's point of view.
The only problem is that it defines a function with no actual processing. This can be a bit confusing for new users. However, this is a simple problem that can be solved with better documentation. Perhaps we need a warning if actual processing is written within the function, since we only use the function's argument and return types.
Also, in order to maintain state such as history, perhaps the response should be a class based on pydantic model that can store such information.
I'm going to call this not possible and instead move work on the principles here to #322
Description
Right now, everything is state, including all template variables. For certain variables this makes sense. Things you want the consumer of the class to provide at construction time. But sometimes the expectation is that the consumer will create an instance of the class and make multiple calls, changing the state between calls. This feels somewhat wrong.
Consider the following example:
In this case, `genre` shouldn't be state. If it were supposed to be state, we would use it more like this:

The idea here is that if `genre` is state then it likely won't be changed often (or at all) between calls. But for use-cases where the consumer changes `genre` every call, this flow makes less sense. A better flow would be something like this:

Of course, the `prompt_template` may still rely on internal state, and that internal state should be able to access the current call arguments. This is necessary for chaining:

The above is extremely rough in terms of the interface design, but the idea is there.
- Require `genre` when calling `call`.
- Support this in both the `BaseCall` and `BaseExtractor` classes.

Not 100% how to best implement this yet. Likely some version of a ParamSpec update with generics on the base and subclass `call` and other methods to accept the arguments, plus something like a `call_args` class variable that we can set at the very beginning of `call` so that when we access the properties they will have access to the correct call arguments. `call_args` could also then be set at definition time to provide default values.

A benefit of doing this is that now users can choose between what is state vs. not state.
If they want `history` to be state and create an agent, great. If they want to create a separate agent class that passes history into a separate call class as an argument, great. This provides maximum flexibility.

Furthermore, ideally the user would be able to better write an agent by overriding the `call`, `stream`, etc. methods so that the expectation is to use these base methods and not some new method (unless the creator wants to write that new method, which they can). For example:

The above now means that a consumer of `Mathematician` would be expected to use the `call` method rather than some custom method like `solve_problem`. To me this makes more sense. Again, not sure this is the right implementation of the interface, but it's in the right direction.

Note also that the agent example's `call` internals assume a lot of other feature requests / bug fixes that are in the works, but that shouldn't affect this request.
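A hedged sketch of the `Mathematician` idea (a stand-in base class with no real API call, not the elided original example): the consumer uses the overridden `call` rather than a bespoke `solve_problem`, and any extra pre/post-processing lives inside the override.

```python
class OpenAICall:
    """Stand-in base class; a real one would hit the OpenAI API."""

    prompt_template = ""

    def call(self, **call_args: str) -> str:
        return self.prompt_template.format(**call_args)


class Mathematician(OpenAICall):
    prompt_template = "Solve the following math problem: {problem}"

    def call(self, *, problem: str, **kwargs) -> str:
        # consumers use the standard `call`, not a custom `solve_problem`;
        # custom pre/post-processing can happen around the super() call
        response = super().call(problem=problem, **kwargs)
        return response


print(Mathematician().call(problem="2 + 2"))
# Solve the following math problem: 2 + 2
```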