elijahbenizzy opened 5 months ago
Adding to the discussion, I think the "pydantic" and the "decorator/class" approaches could be dubbed the "centralized" vs. "decentralized" state model.

I'll be focusing on the benefits of the "centralized" state model, which may differ slightly from the "typed state" benefits above. The simplest integration would be to subclass both `State` and `BaseModel` to add some basic functionality:
```python
from pydantic import BaseModel
from burr.core import State

class BurrState(BaseModel, State):
    foo: int
    bar: str
```
Then, the model is passed to the `ApplicationBuilder`:

```python
app = (
    ApplicationBuilder()
    .with_actions(...)
    .with_transition(...)
    .with_state(model=BurrState())
    .with_entrypoint(...)
    .build()
)
```
## Graph structure validation

We can ensure that all `writes` and `reads` fields are present on the `BurrState`. If we also support "decentralized" type annotations on the `@action`s, we could ensure that both match.
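A minimal sketch of what such a check could look like. This is hypothetical, not Burr's API: `validate_actions` and the `(name, reads, writes)` tuple shape are invented stand-ins for whatever action metadata the graph actually exposes; only the Pydantic v2 `model_fields` introspection is real.

```python
from pydantic import BaseModel

class BurrState(BaseModel):
    foo: int = 0
    bar: str = ""

def validate_actions(model_cls, actions):
    # actions: iterable of (name, reads, writes) tuples -- a hypothetical
    # stand-in for Burr's real action metadata
    known = set(model_cls.model_fields)
    for name, reads, writes in actions:
        missing = (set(reads) | set(writes)) - known
        if missing:
            raise ValueError(
                f"action {name!r} references fields not on "
                f"{model_cls.__name__}: {sorted(missing)}"
            )

validate_actions(BurrState, [("greet", ["foo"], ["bar"])])  # passes
```

At build time this would run once over the whole graph, so a typo'd field name fails fast instead of surfacing mid-run.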
## Default state

Instead of passing values field-by-field to the `ApplicationBuilder` via `.with_state()`, you can set default values on the `BurrState` or specify which fields are `Optional` or required before starting the application. Using Pydantic models also allows you to subclass and nest models when required to manage complexity. You can also instantiate multiple objects for different configs (dev vs. prod, overrides, debugging) or test cases.
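As a concrete, Burr-independent sketch of defaults and optionality on such a model (`ChatState` and its fields are invented for illustration):

```python
from typing import Optional
from pydantic import BaseModel

class ChatState(BaseModel):
    user_id: str                   # required before the app can start
    history: list[str] = []        # default; pydantic copies it per instance
    summary: Optional[str] = None  # optional field

dev = ChatState(user_id="dev-user")                       # dev config
prod = ChatState(user_id="prod-user", summary="resumed")  # prod override
```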
## Data validation

Pydantic has many validation features, including `validate_assignment`, which could trigger validation of specific fields on `state.update()` or `state.append()`. Broadly speaking, it seems a reasonable development approach to manage state centrally and define "what's legal". As your application grows in complexity and the state machine goes brrrr, it's best to have one source of truth for validation.
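For illustration, here is `validate_assignment` on a plain Pydantic v2 model; the hook into `state.update()`/`state.append()` is the speculative part, and `ChatState` is invented:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class ChatState(BaseModel):
    # re-validate on every attribute assignment, not just at construction
    model_config = ConfigDict(validate_assignment=True)
    retries: int = 0

state = ChatState()
state.retries = 2  # validated, fine
try:
    state.retries = "many"  # not coercible to int -> rejected
except ValidationError:
    print("invalid assignment rejected")
```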
## Integrations

Many LLM tools leverage Pydantic. For instance, a `BurrState` would allow defining schemas for LanceDB and automatically enabling embedding fields for "agent" memory. Another interesting avenue is FastUI, which would allow automatically building UI components for inputs and display of `State` fields.
Good overview. Some other considerations:

- What does the `State` API look like? How much can we override pydantic? Usually Burr is immutable, but a pydantic model doesn't have to be that (necessarily).

OK, API decision, this is up next on implementation. Will support a few different ways to do it -- the key is that it all compiles.
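On the immutability point: pydantic can approximate Burr's immutable-update style with `frozen` plus `model_copy`. A sketch with an invented `MyModel`; note one real caveat, `model_copy(update=...)` does not re-validate the updated values by default:

```python
from pydantic import BaseModel, ConfigDict

class MyModel(BaseModel):
    model_config = ConfigDict(frozen=True)  # attribute assignment raises
    a: int = 0
    b: int = 0

s1 = MyModel()
s2 = s1.model_copy(update={"a": 1})  # returns a new instance; s1 untouched
```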
We support centralized and decentralized. Inputs are typed as normal.
As long as we have a spec of types, it's pretty easy:
```python
# stdlib
OverallState = TypedState[{"a": int, "b": int, "c": int, "d": int}]
OverallState = TypedState[ABCDDataclass]

# with pydantic plugin
OverallState = TypedState[ABCDPydanticModel]
```
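One way a `TypedState` like this could be built -- purely a sketch, not Burr's implementation -- is a `__class_getitem__` that normalizes each spec flavor into a field-to-type dict:

```python
from dataclasses import fields, is_dataclass
from typing import Any, Dict

class TypedState:
    # Hypothetical sketch -- Burr's real TypedState (if built) may differ.
    field_types: Dict[str, Any] = {}

    def __class_getitem__(cls, spec: Any) -> type:
        if isinstance(spec, dict):              # TypedState[{"a": int}]
            types = dict(spec)
        elif is_dataclass(spec):                # TypedState[SomeDataclass]
            types = {f.name: f.type for f in fields(spec)}
        elif hasattr(spec, "model_fields"):     # TypedState[PydanticModel]
            types = {k: f.annotation for k, f in spec.model_fields.items()}
        else:
            raise TypeError(f"unsupported state spec: {spec!r}")
        # return a subclass carrying the normalized spec
        return type(cls.__name__, (cls,), {"field_types": types})
```

All three subscript forms from above then land on the same internal representation, which is what downstream graph validation would consume.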
Then we can use:
```python
@action(reads=["a", "b"], writes=["c", "d"])
def foo(state: OverallState) -> OverallState:
    pass
```
Note you can also define this anonymously. Probably going to require the reads/writes still, but if you think about it, it's technically optional...

```python
@action(reads=["a", "b"], writes=["c", "d"])
def foo(state: State[{"a": int, "b": int}]) -> State[{"c": int, "d": int}]:
    pass
```
```python
graph = GraphBuilder()....with_typing(TypedState)  # or on the application builder
```

Note this will work with or without the above -- more likely one would do the other.
```python
burr.typing.get_type_dict(graph)
burr.typing.get_action_input_dict(graph, action)
burr.typing.get_action_state_input_dict(graph, action)
burr.typing.get_action_state_output_dict(graph, action)
```

```python
b_pydantic.get_type_model(graph, exclude=..., include=...)
b_pydantic.get_action_input_model(graph, action)
b_pydantic.get_action_state_input_model(graph, action)
b_pydantic.get_action_state_output_model(graph, action)
```
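For the `b_pydantic` half, pydantic's real `create_model` can already build such a model from a field-to-type dict. The `type_dict` below is an invented stand-in for whatever `get_type_dict(graph)` would return:

```python
from pydantic import create_model

# hypothetical output of burr.typing.get_type_dict(graph)
type_dict = {"a": int, "b": int, "c": int, "d": int}

# every field required (the `...` default) -- roughly what a
# get_type_model(graph) could hand back
GraphModel = create_model(
    "GraphModel", **{k: (t, ...) for k, t in type_dict.items()}
)
```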
```python
MyState = PydanticState[MyModel]

class MyState:
    substate: ...
```

```python
@action(reads=["a", "b"], writes=["a", "b"])
def my_action(state: MyState) -> MyState:
    state.c = fn(state.a, state.b)
    state.d = ...
    return state
```
```python
@action(reads=["a", "b"], writes=["c", "d"])
def my_action(state: MyState) -> MyState:
    state.model.c = fn(state.a, state.b)
    state.model.d = ...
    return state
```
```python
class PydanticState:
    a: Optional[AModel]
    b: Optional[BModel]

@action(reads=["a", "b"], writes=["c", "d"])
# the state model is the dynamically subsetted one
def my_action(state: PydanticState) -> PydanticState:
    state.model.c = fn(state.a, state.b)
    state.model.d = ...
    state.model.e  # throw an error (not allowed to read from e, doesn't exist/not declared)
    state.model.a = "foo"  # throw an error, because you didn't declare it
    state.model.d = ...
    return state
```
```python
@action(reads=["*"], writes=[...])
# subset everything?
def my_action(state: PydanticState) -> PydanticState:
    # do the kitchen sink -- whatever you want with the whole state
    return state
```
## Idea -- give everything state

```python
@action(reads="*", writes=["a", "b"])
def my_action(state: MyState) -> MyState:
    state.c = fn(state.a, state.b)
    state.d = ...
    return state
```
```python
Application[MyState]
builder.with_state(MyState).build()
application.state  # IDE should know MyState
state, ... = application.run(...)  # IDE should know MyState
```
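That `Application[MyState]` typing could come from making the class generic over its state type. A sketch with a toy `Application`, not Burr's actual class:

```python
from typing import Generic, TypeVar

S = TypeVar("S")

class Application(Generic[S]):
    # toy stand-in: generic over the state type so IDEs infer it
    def __init__(self, state: S) -> None:
        self._state = state

    @property
    def state(self) -> S:
        return self._state

app = Application(state={"a": 1})  # checkers infer Application[dict[str, int]]
```

With this shape, `app.state` carries the concrete state type through to autocomplete and type checkers without any plugin.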
See #350
Currently state is untyped. Ideally this should be able to leverage a pydantic model, and possibly a typed dict/whatnot. We'll need the proper abstractions however.
Some requirements:
- Each action can declare its own types (`writes`/`reads` potentially). Then we have different actions that can be compiled together. We should also be able to do this centrally.
- Untyped fields default to `Any`, which is bidirectionally compatible with typing.

Ideas:
pydantic

- `-` hard to do transactional updates
- `+` IDE integration is easy
- `+` easy integration with web-services/fastAPI
- `~` subsetting is a bit of work, but we can bypass that by using the whole state

in the decorator/class

- `-` No free IDE integration (without a plugin)
- `+` simple, loosely coupled, easy to inspect
- `~` duplicated between readers and writers (can't decide if this is bad?)