Open Veetaha opened 2 months ago
@dzmitry-lahoda I think I know what compile-time builder crate you are talking about which implements such a feature, and I was thinking about having it at some point in bon
as well. This requires thorough design, because I'd like to avoid having the users of this feature do an unwrap
, for example when they consume such groups of parameters. However, I understand that this may not be avoidable, and there will be some limit to the strictness of types.
For example, with clap
all parameters in a group are just Option<T>
types, and you have an implicit contract that clap
validates the invariants of the group for you.
I was also thinking of having some lower-level API in bon to assist in constructing such complex type states. There is nothing material yet, but at some point this will be the next big feature priority in bon
and then I'll think on the design more thorougly
In the meantime I think the simplest way to achieve member groups is just by using enums. Like this:
#[derive(bon::Builder)]
struct Example {
#[builder(into)]
group: Group
}
#[derive(from_variants::FromVariants)]
enum Group {
Variant1(Variant1),
Variant2(Variant2),
}
#[derive(Builder)]
struct Variant1 { /**/ }
#[derive(Builder)]
struct Variant2 { /**/ }
let example = Example::builder()
.group(Variant1::builder()/*set the members of the group*/.build())
.build();
Note that this snippet is for a hypothetical use case that needs to have mutually exclusive groups of parameters. Your use case may probably be different. Maybe if you provided some real example (or real-ish if your use case is private), just to have specific problem example to start with? That'll help with understanding the problem domain better for me.
Btw. just a heads-up, I'm going to do a 2.2
release of bon on Sunday afternoon that will feature a new #[derive(bon::Builder)]
syntax. It was added for various unobvious reasons that I'll describe in the blog post that I'll publish on Reddit
here is my type for example (I have Market, Token types as well)
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary, Builder))]
pub struct PlaceOrder {
#[cfg_attr(test, builder(default))] // #[builder(default)]
pub session_id: SessionId,
#[cfg_attr(test, builder(default))] // #[builder(default)]
pub market_id: MarketId,
pub side: Side,
pub fill_mode: FillMode,
#[cfg_attr(test, builder(default))] // #[builder(default)]
pub is_reduce_only: bool,
/// must be same decimals as `Price` if not `None` at current state
pub price: Option<NonZeroPriceMantissa>,
#[cfg_attr(test, proptest(filter = "|x| x.is_some()"))]
pub size: Option<NonZeroOrderSize>,
pub quote_size: Option<NonZeroU128>,
/// optional user id which allows to place order on behalf of another user
pub user_id: Option<UserId>,
pub client_order_id: Option<u64>,
}
here is my manual builder(for Market and Token I do not have manual builder yet so):
pub fn order_for(market_id: MarketId) -> OrderTarget<NoSession, NoUser> {
OrderTarget::<NoSession, NoUser> {
market_id,
session_id: NoSession,
user_id: None,
marker: Default::default(),
}
}
/// Typestate safe concise builder for various orders.
///
/// # Implementation details
/// Unfortunately most builders are not really useful,
/// Could try to use some - but still likely not really useful,
/// even after proper PlaceOrder refactoring (so it would be good)
#[allow(dead_code)]
mod private {
use crate::{
FillMode, MarketId, NonZeroOrderSize, NonZeroPriceMantissa, PlaceOrder, SessionId, Side,
UserId,
};
use core::marker::PhantomData;
pub struct FillModeLimit;
pub struct FillModeImmediate;
pub struct FillModePostOnly;
#[derive(Clone, Copy)]
pub struct SessionSet(pub SessionId);
#[derive(Clone, Copy, Default)]
pub struct NoSession;
pub struct NoUser;
pub struct UserSet;
#[derive(Clone, Copy)]
pub struct OrderTarget<Session, User> {
pub market_id: MarketId,
pub session_id: Session,
pub user_id: Option<UserId>,
pub marker: PhantomData<(Session, User)>,
}
impl OrderTarget<NoSession, NoUser> {
pub fn session(&self, session_id: SessionId) -> OrderTarget<SessionSet, NoUser> {
OrderTarget {
session_id: SessionSet(session_id),
market_id: self.market_id,
user_id: self.user_id,
marker: Default::default(),
}
}
pub fn user(&self, user_id: UserId) -> OrderTarget<NoSession, UserSet> {
OrderTarget {
user_id: Some(user_id),
market_id: self.market_id,
session_id: self.session_id,
marker: Default::default(),
}
}
}
impl OrderTarget<SessionSet, NoUser> {
pub fn user(&self, user_id: UserId) -> OrderTarget<SessionSet, UserSet> {
OrderTarget {
user_id: Some(user_id),
market_id: self.market_id,
session_id: self.session_id,
marker: Default::default(),
}
}
}
#[derive(Clone, Copy, Default)]
pub struct NoSide;
pub struct AmountSet;
pub struct NoAmount;
#[derive(Clone, Copy)]
pub struct SideSet(pub Side);
pub struct NoPrice;
pub struct PriceSet;
impl<User> OrderTarget<SessionSet, User> {
pub fn post_only(
&self,
side: Side,
) -> OrderBuilder<FillModePostOnly, SideSet, NoAmount, NoPrice> {
OrderBuilder {
fill_mode: FillMode::PostOnly,
session_id: self.session_id.0,
market_id: self.market_id,
user_id: self.user_id,
side: SideSet(side),
marker: Default::default(),
price: None,
size: None,
}
}
pub fn limit(&self, side: Side) -> OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
OrderBuilder {
fill_mode: FillMode::Limit,
session_id: self.session_id.0,
market_id: self.market_id,
user_id: self.user_id,
side: SideSet(side),
marker: Default::default(),
price: None,
size: None,
}
}
pub fn fill_or_kill(
&self,
side: Side,
) -> OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
OrderBuilder {
fill_mode: FillMode::FillOrKill,
session_id: self.session_id.0,
market_id: self.market_id,
user_id: self.user_id,
side: SideSet(side),
marker: Default::default(),
price: None,
size: None,
}
}
pub fn immediate_or_cancel(
&self,
side: Side,
) -> OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
OrderBuilder {
fill_mode: FillMode::ImmediateOrCancel,
session_id: self.session_id.0,
market_id: self.market_id,
user_id: self.user_id,
side: SideSet(side),
marker: Default::default(),
price: None,
size: None,
}
}
}
impl OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
pub fn size(&self, size: NonZeroOrderSize) -> PlaceOrder {
PlaceOrder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: self.side.0,
price: None,
size: Some(size),
is_reduce_only: false,
quote_size: None,
client_order_id: None,
}
}
pub fn price(&self, price: NonZeroPriceMantissa) -> PlaceOrder {
PlaceOrder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: self.side.0,
price: Some(price),
size: None,
is_reduce_only: false,
quote_size: None,
client_order_id: None,
}
}
}
impl OrderBuilder<FillModePostOnly, SideSet, NoAmount, NoPrice> {
pub fn amount(&self, size: NonZeroOrderSize, price: NonZeroPriceMantissa) -> PlaceOrder {
PlaceOrder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: self.side.0,
price: Some(price),
size: Some(size),
is_reduce_only: false,
quote_size: None,
client_order_id: None,
}
}
}
impl OrderTarget<NoSession, UserSet> {
pub fn session(&self, session_id: SessionId) -> OrderTarget<SessionSet, UserSet> {
OrderTarget::<SessionSet, UserSet> {
session_id: SessionSet(session_id),
user_id: self.user_id,
market_id: self.market_id,
marker: Default::default(),
}
}
}
impl OrderBuilder<FillModeLimit, NoSide, NoAmount, NoPrice> {
pub fn side(&self, side: Side) -> OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
OrderBuilder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: SideSet(side),
marker: Default::default(),
price: None,
size: None,
}
}
}
impl OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
pub fn amount(
&self,
size: NonZeroOrderSize,
price: NonZeroPriceMantissa,
) -> OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet> {
OrderBuilder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: self.side,
size: Some(size),
price: Some(price),
marker: Default::default(),
}
}
}
impl OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet> {
pub fn reduce_only(&self) -> PlaceOrder {
PlaceOrder {
fill_mode: self.fill_mode,
session_id: self.session_id,
market_id: self.market_id,
user_id: self.user_id,
side: self.side.0,
price: self.price,
size: self.size,
is_reduce_only: true,
quote_size: None,
client_order_id: None,
}
}
}
impl From<OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet>> for PlaceOrder {
fn from(val: OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet>) -> Self {
PlaceOrder {
fill_mode: val.fill_mode,
session_id: val.session_id,
market_id: val.market_id,
user_id: val.user_id,
side: val.side.0,
price: val.price,
size: val.size,
is_reduce_only: false,
quote_size: None,
client_order_id: None,
}
}
}
#[derive(Clone, Copy)]
pub struct OrderBuilder<FillModeType, SideType, AmountType, PriceType> {
market_id: MarketId,
side: SideType,
session_id: SessionId,
user_id: Option<UserId>,
fill_mode: FillMode,
marker: PhantomData<(FillModeType, SideType, AmountType, PriceType)>,
price: Option<NonZeroPriceMantissa>,
size: Option<NonZeroOrderSize>,
}
}
impl PlaceOrder {
pub fn validate(&self) -> Result<&Self, Error> {
match self.fill_mode {
FillMode::Limit | FillMode::PostOnly => {
ensure!(self.price.is_some(), Error::OrderMissingLimits);
ensure!(
self.size.is_some() || self.quote_size.is_some(),
Error::OrderMissingLimits
);
}
FillMode::ImmediateOrCancel | FillMode::FillOrKill => {
ensure!(
self.price.is_some() || self.size.is_some() || self.quote_size.is_some(),
Error::OrderMissingLimits,
"IoC and FoK - require at least one limit specified"
);
}
}
Ok(self)
}
}
@Veetaha enums are good, but
proto
file defined PlaceOrder which almost exact PlaceOrder in Rust (one to one mapping, with some numeric validations done in pure Rust side). we could also have directly use prost or other serde on PlaceOrder. so we cannot change it easy to enum.size_*
fields are forming some kind of groups/constraintsso groups vs custom enums is tradeoff.
but sure - enums will help in some cases. for example we build like this one:
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary, bon::Builder))]
pub struct Action {
pub timestamp: i64,
#[cfg_attr(test, builder(default))]
pub nonce: u32,
pub kind: ActionKind,
}
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary))]
pub enum ActionKind {
CreateSession(CreateSession),
RevokeSession(RevokeSession),
CreateToken(CreateToken),
CreateMarket(CreateMarket),
PlaceOrder(PlaceOrder),
...
}
Yeah, I understand enums may be unreasonable when the combinations of states would contain a lot of duplicate fields and projections of each other, and the number of variants may also be high... Therefore this feature is still needed. I'm just thinking what people usually do to solve this ptoblem today with the tools currently available. I suppose it's one of:
- Writing a complex builder with typestate manually
- Trying to reshape the model to enums if it's reasonable
- If no reasonable enums representations exist, just do a builder that validates everything at run-time
yes, yes, yes, and.
validation starts from binary data, like comparing some hash or crc and/or signature.
then goes to serde level, whatever serde provides (proto file defined is very limited), what types and conversion supported.
than goes more domain validation (enums goes here), so mapping raw data after serde and hash verification into more domain like objects. domain objects are also useful to build data which converted to serde in tests.
and than goes validation within some data/storage context, when it is required to have external data (in our case order validation requires market and tokens states from storage).
So in general there is not solution which fits, there is just ergonomics here and there all around. Bon takes some nice niche to fill somewhere around.
here is builder with groups https://docs.rs/const_typed_builder/latest/const_typed_builder/ .
- Writing a complex builder with typestate manually
- Trying to reshape the model to enums if it's reasonable
- If no reasonable enums representations exist, just do a builder that validates everythwwwing at run-time
Validation at run time can be as complex as SAT solver, for example interfield JSON validation. But when there is trust in communication like sender always provides valid data, less validation on receiver needed. In general sender can send proof that his data is according some validation criteria.
So from perspective of productivity and some form of automation of thinking, groups seems good feature bringing a lot of value to developers. No need to maintain typestate pattern manually (correct one is hard to write), no need to reshape what is hard to reshape to enums(or harmful, for example performance, data compression, versioning, SoA vs AoS - all can just say no to enums), no need to validate on receive side in trust boundaries. Just compiler runs more checks.
Ultimate answer to those who ask why builder is needed imho.
@Veetaha any ideas on impl details?
I think of next:
From const-typed-builder docs:
this implementation currently has a complexity of $O(2^g)$ where $g$ is the amount of grouped variables
That sounds horrible. I'm afraid of how easy it will be to fall into that exponential pit and how much overhead it may bring to bon
overall in terms of its simplicity and maintenance burden.
As I see such ideas appeared in typed-builder
as well, but they never came to fruition:
At which point I start to question if it's even worth it. It's probably much simpler for people to do such group checks at runtime, for which we already have a mechanism - putting a #[builder]
on a function or method.
I'd be glad to be proven wrong. Maybe there is some good subset of this feature that scales fine and solves a big chunk of the "member group" problem. For example, suppose that we just don't support at_least(N)
clause. Maybe simple requires/conflicts_with
clauses that specify just a single member. Will that allow us to avoid the exponential complexity?
Anyway, I think the way to proceed here:
compatibility with other not yet implemented features - what are these?
You can see all the features planned in the issues in this repo. I don't think there are any of them in conflict with this feature yet.
Another thing that is currently only in my mind is some kind of partial stabilization of the type state API. For example, provide some set of traits for the users so they can add custom setters/getters (basically any methods) to the builder.
It's something along the lines of https://github.com/idanarye/rust-typed-builder/issues/23 and https://github.com/idanarye/rust-typed-builder/issues/22.
In fact, maybe such a feature would actually be a better replacement for the "groups" feature. It would require people to write a bit more manual code, but probably not as much as they'd do writing such type state from scratch.
That sounds horrible.
2*8 is just 1024, no so big number. just limit number of groups and it becomes constant.
how much overhead it may bring to bon overall in terms of its simplicity and maintenance burden.
Currently bon is over engineered, I believe amount of code of macro can be 2x less with still using quote (problem is flat contexts and existing separation - it feels can do other split). Also it does not steam tokens, but allocates and concats pieces of heaps of token streams, while could do just direct streaming as code produced. Quote may also be less used for composability (will give speed and simplicity too imho).
But if to say if any interfield feature will give more complexity? Yes. Any, including manually expressing next for sure:
In fact, maybe such a feature would actually be a better replacement for the "groups" feature. It would require people to write a bit more manual code, but probably not as much as they'd do writing such type state from scratch.
groups of other kind? yeah it seems all depends what ""groups"" allow. Like if if you set this filed, must set other 2
, but cannot preventing setting up other fields. that is group
sub feature/projection.
there are various projection of """groups""" which
but they never came to fruition:
the person who did a lot in that area was not collaborative, he for some reason use codeberg to store code under some obscure licence. while he should have been directly PR into repo... also he was weird stating his algo is best or at all writing algo from scratch. there likely industrial grade solution to solve """groups""". i guess const-typed-builder
more close to things.
It's probably much simpler for people to do such group checks at runtime,
Yes. But runtime is like dynamic typing or untyped code. All static typing of Rust can be done in assembly/C just by writing unit tests. But for some reason people thing Rust is better? Just because type system checks is close to place of declaration and instantiation, not some code else where which may be even unsound.
@Veetaha
Maybe there is some good subset of this feature that scales fine and solves a big chunk of the "member group" problem.
yeah, there are projection, including which you mentioned manual approach with tuples. So for me tuples smell https://github.com/lloydmeta/frunk which is not idomatic Rust.
but projection searching which is does not grows so fast sure is thing.
like 2^g
is bad. what if there is 2^(log g)
? For 8 groups it is just 2^3 which basically is C. O(C). If g is limited all becomes C. So with log can limit to max 16g (ever nobody will want more), it is already just 2^4 variants. So exponential is kind of fine if it is log and hard limit.
Maybe simple requires/conflicts_with clauses that specify just a single member.
May be, singular requires/conflics allows for chains(hierarchical) of requires/conflicts which may be easy to solve.
And yeah, need to find what is most useful alternative to refactoring to enums
2*8 is just 1024, no so big number. just limit number of groups and it becomes constant.
Note that the documentation in const-typed-builder
mentions that the N
in 2 ** N
is the number of members in the group (not the number of groups).
Currently bon is over engineered, I believe amount of code of macro can be 2x less with still using quote (problem is flat contexts and existing separation - it feels can do other split). Also it does not steam tokens, but allocates and concats pieces of heaps of token streams, while could do just direct streaming as code produced. Quote may also be less used for composability (will give speed and simplicity too IMHO).
I don't understand how you'd decrease the amount of code by 2 times while preserving the existing features.
Regarding "flat contexts and existing separation - it feels can do other split", what do you mean by "flat context", and which separation do you think of? Isn't another split going to increase the amount of code (not decrease it)? The approach of using quote for composition is the go-to common approach that is used by basically every proc macro crate, and I don't see any problem with that. What would this "direct streaming" even look like?
what do you mean by "flat context", and which separation do you think of?
I mean there is exists some kind of flat split of code(ctx, non orthogonal things), could merge all code together, shuffle it, may by using quote! less, and than unmerge back into more like graphy/orthogonal/composeble state which gives less code and more asier to compose in features.
That is what I did and others in ElastichSearch/SQL queries/builders/DSL if to code in F# style for example. quote! kind of also hinders some ability to compose.
What would this "direct streaming" even look like?
There are several concats
of tokenstreams, and also I added one for getters.
While could separate stages, one is just assemble one big graph, and than traverse it streaming into TokensStreams, never concat. Several stages.
I would not tell you how exactly it can be done, I just did it in various places similar things, it just seems visibly applicable here. I understand why people write proc macro in flat non streaming way
, may be similar reason people write Go, not Rust.
UPDATE:
There is essence of builders and getsetters, which may be essences from code and than some well know combinations of code transformations summaries, in the end there will be graph, which when transverse just generates one single token stream.
While could separate stages, one is just assemble one big graph, and than traverse it streaming into TokensStreams, never concat. Several stages.
Yeah, I understand that it's possible to add yet another layer of type models, whereby the code will construct this graph structure (like an execution plan in terraform) and then convert it all into a TokenStream
in a single pass. The current BuilderGenCtx
is that one structure but it is intentionally quite "flat", because that is good enough for the current state of things, and for the rest raw TokenStream
works fine. If you need any more strict typing then syn::parse_quote!()
can be used to operate with the syn's typed AST instead of quote!()
.
All-in-all any suggestion to perform a split will increase the code size and not decrease it because it adds more abstractions, and thus more type definitions, more functions and more code in general. So the ideas to "cut the code size by 2" and "do a split" contradict each other.
Keeping the number of types and models to the exact required minimum is what I think for the better and simpler to understand code
Allow for groups and conflicts like in clap. Also saw something like that in compile time builder, cannot find it now, as I recall it needed some solver to be run at build time.
These features allow to state:
I have my manually crafted builders and do:
Originally posted by @dzmitry-lahoda in https://github.com/elastio/bon/issues/85#issuecomment-2336491289
A note for the community from the maintainers
Please vote on this issue by adding a 👍 reaction to help the maintainers with prioritizing it. You may add a comment describing your real use case related to this issue for us to better understand the problem domain.