Closed mzeitlin11 closed 7 months ago
Wow! It's taken me a few days to work through this (and likely will a few more...) but I am really looking forward to getting this in.
because it's not defined correctly in stripe's own spec, the code generation doesn't correctly define stuff like https://stripe.com/docs/api/prices/create#create_price-currency_options, which should be Option<HashMap<crate::currency::Currency, crate::generated::CurrencyOption>>
edit: also cashapp is missing from the payment method enums, which i also suspect is a problem on stripe's end
edit: sorry about spamming edits, just want to document whatever issues i find; stripe::generated::price::get_prices
will throw
error serializing or deserializing a request: data[0].product invalid type: string, expected a sequence at line 18 column 17
because Price.products
is expected to be a Vec<crate::generated::Product>
when the old type seems correct, i'm not sure if this is a broader issue with Expandable
, also passing expand: Some(vec!["data.product".to_string()])
just changes the invalid type to map
, since it expects a vec
edit: it does seem to be something generally wrong with the Expandable
fields, the same vec problem is present with CheckoutSession.{customer, payment_intent, etc.}
@mzeitlin11 hi i'm interested in filling out the Expandable
generation here, can u give me some high level guidance on what needs to be done?
@mzeitlin11 hi i'm interested in filling out the
Expandable
generation here, can u give me some high level guidance on what needs to be done?
Hi @databasedav, good timing, was just hoping to start getting back to this soon! I need to refamiliarize myself with what's going, rebase, and clean stuff up. Will probably be hard to work off this branch since lots of changes still required that will determine how Expandable
(and some related missing pieces) best handled. Will let you know when this ends up in a state where tasks like implementing Expandable
can reasonably be split out.
Status Update:
Main outstanding TODO's:
x-stripeResource
, so there is now some natural parent / children structure between different schema components, making this easier. This also keeps the import structure clearer and the types shorter since they are namespaced within modules.external_account
should placed with account
since it only requests nested paths starting with account
.Paginable
for the parameters to a request, e.g. ListAccount
. However, this means that we need to implement Paginable
for a bunch of different List*
params, where what really matters is the return value, not the parameter type. Paginable
for List<T>
where T: Object
. This would automatically handle paginable cases we currently don't handle, for example retrieving an Account
comes with a List<ExternalAccount>
that should be paginable.miniserde
work is included under a min-ser
flag. Some needed codegen improvements are still left as todo!()
for generating Deserialize
implementations (derives are impossible for cases like the enums with fields since the technique of using object
key as a discriminant doesn't work naturally with the miniserde
approach).title
field. Within a single request file, it should be possible to deduplicate groups of parameters that have the same title
field and an equivalent inferred type.APIVersion
which is generated as a parameter to a single webhook request, but we also want to be importable from the top level so it can be used in the client.Hey! Thanks for continuing on this. I am back from holidays today :surfer: and plan on addressing all your questions with some ideas.
Ok! I will have comments about the openapi crate as well soon...
I am going to find some time to update my code-splitting PR and see if I can apply the same analysis to this PR and see if there are any clean boundaries between modules.
hacks around inferring which id fields should be used as cursors
I think this is OK, we have enough hacks in the previous impl so I don't mind leaving this and coming back to it later if necessary since the scope of the 'damage' is probably quite limited.
It seems more natural to implement Paginable for List
where T: Object
That makes sense to me. Happy to brainstorm implementations if you want.
Great! Also happy to help prototype on this. My intuition says it will be fairly straight forward to just spit out a manual trait impl for enums.
There are a few interesting parts here, but I think this should be tackled later. Some points to consider:
Crates
I am going to find some time to update my code-splitting PR and see if I can apply the same analysis to this PR and see if there are any clean boundaries between modules.
Sounds great! Will push up an initial implementation of this soonish (hopefully), but that will be more about setting up the structure of having the codegen understand different crates and crate deps than finding smart module boundaries. Initial efforts gave some annoying cycles like in your PR, due to cycles from cases like
pub enum BalanceTransactionSource {
...
IssuingAuthorization(crate::issuing::authorization::Authorization),
IssuingDispute(crate::issuing::dispute::Dispute),
IssuingTransaction(crate::issuing::transaction::Transaction),
...and many more
}
combined with
pub struct Authorization {
...
/// List of balance transactions associated with this authorization.
pub balance_transactions: Vec<crate::balance_transaction::BalanceTransaction>,
...
}
Without ending up with just a giant core
type crate to break the cycle, the other solution I can think of would be to define the shared base module types in some core
crate, but keep the requests in separate crates. e.g. in this case BalanceTransactionSource
and issuing::authorization::Authorization
would live in core
, but the issuing
related requests would live in their own crate (since most of the code bloat comes from the requests).
The annoyance with that (other than more codegen complexity) is that right now requests are namespaced as
impl issuing::authorization::Authorization {
}
which would have to change if issuing::authorization::Authorization
lives in a different crate
EDIT: this kind of approach would also have the benefit of solving the webhook deserialization problem by keeping all the event types in the same crate so webhooks could be enabled without requiring everything else
Nice. There is of course also the possibility of not splitting at all, since the code size with miniserde is so so much better, but let's give it a shot and see if we can find something that works.
Added some code messing around with type deduplication within files. Definitely needs cleanup / docs (as does everything else), but wanted to include a POC since it influences compile time. Need to finish off miniserde
stuff to then see what compile time looks like and whether better crate splitting is necessary. (probably requiring an idea like mentioned in the bottom of https://github.com/arlyon/async-stripe/pull/314#issuecomment-1599305824)
Could also introduce granular features to help compile time pretty easily (since we have ideas of dependencies and features could just gate the related mod requests
). e.g. "apple_pay_domain" would be a feature to enable just those specific requests. Only concern is that too many granular features makes library less usable since need to look through many features to find what needs to be enabled (and probably need to autogenerate some good feature table for the docs).
Related to https://github.com/arlyon/async-stripe/pull/314#issuecomment-1599305824, cc @arlyon if you have any thoughts about the stripe::Client
- I know you've had some ideas about changing things with an MSRV that allows GAT's. Depending on the structure of that, might affect how we want to generate requests. But regardless of that, curious of your thoughts on current pattern of
Account::create(&client, param_struct)
vs. more of a builder pattern like
CreateAccount::new(required_params).param1(value).param2(value).send(&client)
, more like how https://github.com/awslabs/aws-sdk-rust does things. Advantages of the former seem to be that it's more consistent with stripe libs in other languages, but the main annoyance is that it forces the Account
requests to be defined in the same crate as the Account
type signature, making crate splitting harder. Builder pattern also helps with something like https://github.com/arlyon/async-stripe/issues/380
Hi there! My company uses async-stripe in our backend. We're only using a small fraction of the Stripe API. So, I'd be happy to test out migrating breaking changes for a realistic use-case (I suspect there's a fair number of other users who also only use a small part of the API).
IMO, breaking changes are OK as long as there's a good migration guide. I think Clap 4.0 had a great migration guide as did Dropshot 9.0 here -- explaining how to migrate is good, breaking changes have to happen eventually. And especially if there's a good reason to make them -- I think many users would be thrilled by the compile time improvements, even if it requires some manual migrations.
If this PR is in an OK state, I'd be happy to start a WIP branch on my work's backend and report what my migration experience was like.
@adamchalmers thanks for volunteering to help out. I think the general structure of the codegen is mostly there. We (well, @mzeitlin11) may move some things around internally to try to break the code up better and reduce the size of the compilation units and, as you saw, there is still the open question about which syntax to use for generating the requests.
If you are OK with these changes down the line then go ahead. Some feedback from an integration branch would be great.
Speaking of which @mzeitlin11 do you think it would be handy to schedule a synchronous call soon to hash out some of the remaining items? You can book something here https://usemotion.com/meet/alexander-lyon/meeting
Started work on it. One note so far is that this PR requires time-core
<= 0.1.0 due to MSRV.
https://github.com/mzeitlin11/async-stripe/blob/types/async-stripe/Cargo.toml#L110-L111
But the time
crate as of version 0.3.16 has required time-core
= 0.1.1, so to integrate this branch, I've had to downgrade my project to time 0.3.16 from the latest version (0.3.22) and ensure it stays there.
I'd suggest allowing users to use time-core 0.1.x in the final PR.
Booked time for next week!
Crate splitting is now implemented with this initial structure:
stripe_types
is a special case which contains base types like Currency
, id generation, and all the core types which must be included to avoid circular dependencies. This approach helps a ton with compilation pipelining, because stripe_types
has no dependency on the client, so can start compiling as soon as serde / serde_json
are ready. Having no requests in stripe_types
adds the complexity that, for example, a type like Account
lives in stripe_types
, but the requests related to Account
live in stripe_connect
. This should be able to made opaque to the user, however, by just re-exporting stripe_types::Account
in stripe_connect
. In theory, stripe_types
should never have to be a user dependency at all.
The majority of the code (so largest compile time costs) then lives in the other generated crates since the request-related structs are the majority of what we generate. Since requests don't depend on each other, feature-gating different API resources should be straightforward, without having to worry about incompatible feature sets. Once those request-related features are added, the compile time should shrink toward how long stripe_types
takes to compile since most users use only a small bit of the giant API.
inner_classes
and in_class
(used a bunch for crate inference and organization) were gone :(. So some more work needed to get this back into functional shape for the latest spec version.AccountObject
which are single-variant enums only used as discriminants with serde(untagged)
. (ideally these object names would allow using tagged deserialization as well).ApiError
from stripe_types
, so this type can be used instead of the manual version, ideally fixing https://github.com/arlyon/async-stripe/issues/384stripe_connect::account::create(&client, CreateAccount::new())
- requiring both the import of the function create
and the CreateAccount
parameters. It feels more user-friendly to instead have the API call be something like CreateAccount::new().send(&client)
.stripe_connect, stripe_payment
) in which there are requests they want to use. It would be more complicated on the codegen side, but the main async-stripe
crate could instead reexport these crates as necessary based on features, so users only need to depend on async-stripe
.miniserde
stuff was ripped out. The main issue I ran into is that untagged enums are at odds with how the miniserde
Deserialize
trait is defined. So initial implementations could use miniserde
as a tool for deserializing untagged enums (https://github.com/arlyon/async-stripe/pull/314/commits/64eb517ea82129718aa2e537d88efa7a272f83ec#diff-806d0bc9794937ef7af03d15d2438671d3568ed8f3754bcac57ca1c9bc7c71afL12-L39), but not actually impl the miniserde::Deserialize
trait, which meant that any struct including an untagged enum could not derive miniserde::Deserialize
. So further work could probably figure out solutions, but there would likely need to be significant codegen changes where the code generation has to generate deserialization code, instead of relying on derives in most cases.Pagination added back (closes #283, closes #246).
Slightly different organization, now implemented on both List*
parameters and an existing List<T>
. So instead of
let params = ListCustomer::new();
let stream = Customer::list(&client, ¶ms).await.unwrap().paginate(params).stream(&client);
can do
let stream = ListCustomer::new().paginate().stream(&client);
And can also paginate any List<T>
received, e.g.
let account = RetrieveAccount::new().send(&client).await.unwrap().
// We have one page of `external_accounts`, get the rest:
let external_account_stream = account.external_accounts.into_paginator().stream(&client);
Added back webhook handling in new crate stripe_webhook
. Adding another crate annoying in some ways, but issue with keeping in async-stripe
is that dependencies look like stripe_webhook -> generated/* -> async-stripe
, so adding webhook handling in async-stripe
would create circular dependencies since the client cannot depend on the generated types crates.
Also migrated the examples into standalone example binaries in examples/
. Those also cannot live in async-stripe
anymore since that crate essentially only contains client functionality now, but knows nothing about specific requests.
This is readyish for review / testing (With caveat that these changes require much more testing / docs before being ready for production. @adamchalmers any feedback on issues / annoyances using for real-world use case would be super helpful).
I'd still plan to make more smaller changes around naming + deduplicating some generated types (one example right now is recognizing shared types between component definitions and requests, e.g. AccountType
and CreateAccountType
should be deduplicated. And reexporting types from stripe_types
should still be done so users don't always have to rely on IDE / docs to figure out whether a type like Account
lives in stripe_types
or the relevant crate (stripe_connect
).
Also if there are any opinions on MSRV - once that's settled can remove the time-core
pin (and hopefully other pins) and start getting CI running on these changes.
Built(ish) on #311. This initially started as an attempt to better types involved in the codegen, but turned into more of a codegen refactor to handle (almost) all of the OpenAPI spec.
form/multipart
format)Also, all the ext files are no longer needed!
Left to do in this PR
Add back handling for bespoke prefixed ids. These ended up presenting some complexity since they are updated manually, so don't fit as nicely into the more automated codegen flow.
Then can add back traits for pagination handling
Add back constructor methods (
new
) for parameters, figure out how to handle lifetimes / borrowed data without making type definitions more complexDocument everything, make any edge cases clear
Currently ignoring
Delete
requests with body params. Instead of the currently separatedelete
anddelete_query
methods, would be nice to combine thedelete / post
functions into some builder style system so that delete and post can be handled identically.Making sure doc comments are picked up where possible
Add tests for different categories of requests we generate that weren't previously covered
Looking into file organization (also relevant to questions below). Right now we generate a single file per top-level component in the spec because it makes dependency tracking straightforward (dependencies only introduced through refs in the schema). However, because lots of these components have long names, we start generating items with very long names. I hope that the
x-stripeResource
can be used to organize around a structure whereinner_classes
can just be in a module with their parent, making the structure much clearer (along with crate-splitting / features). For example,account
and friends could becomeinstead of the current
. We could then avoid exporting everything from
resources
and have some namespacing.Solve open questions below!
Open Questions
Handling Breaking Changes
Features / crate-splitting
--graph
argument to the codegen which writes a graphviz-compatible outputs of the component dependency graph instead of generating code (but that will hopefully become much clearer if the file organization mentioned in TODOs is implemented. Right now it includes all components so is very hard to learn from).My hope is that we could define features along the lines of feature
x
includes objectsaccount
,customer
, featurey
includes componentbalance, ...
. Dependency tracking would then enable understanding which features necessarily enable other features to fix #137. With a split into crates, this would make clear what cross-crate deps must need to be.Hopefully, circular dependencies could be handled by just mandating that some specific items live in some crate
stripe-core
.Improve handling of unions
With respect to the compile time pains mentioned above, another solution is to continue exploring usage of something like
miniserde
to improve compile times. Right now the unions are the main deterrent to trying this since they are not supported. Currently, we usedserde(untagged)
for these, but I think for most of these unions of object cases we could use theobject
field as a discriminant which hopefully allowsserde
to write more efficient (and smaller) implementations with better error messages. On theminiserde
front, this would allow a reasonable automated generation of adeserialize
impl through a structure likeWhen writing other unions, we often end up with a structure like
This happens because we only generate simple enums without fields or enums with fields. It would be great to simplify this representation into
which would generate less code (hopefully help with compile time on its own) and make implementing
Deserialize
for unions easierThe other common union pattern that appears in the OpenAPI spec is a union of something like
Account
andDeletedAccount
(which is returned by deleting account and some fetching methods). This was previously handled by adding adeleted
field to top-level components with aDeleted
counterpart and hardcoding the return value of aDELETE
request as aDeleted<>
struct parameterized by an id type. This doesn't seem to match the spec however - anAccount
should never have adeleted
field and the return value of a deleting an account can also be theAccount
itself, theDeleted<>
loses a bunch of information and forces hardcoding the request return value. However the automatically generated representationis a breaking change. Deserialization could be done easily by using presence of the
deleted: true
field, but having to match on variantsNaming request functions
Naming requests is now harder since handling everything means fewer assumptions can be made. For now, this PR just names based on the request "operationId" field since that ensures unique names, but it gives pretty verbose / unexplanatory names. Not sure of best solutions here, best ideas I have are to perhaps just have some simple heuristic-based naming for common patterns (create / delete, etc.) then maybe just manually provide name mappings for other requests? (a pain, but there really aren't that many requests).
Handling duplicate generated content (but different name)
The OpenApi schema can only avoid duplicating content by using reference links. Especially in the case of request parameters, this is not done.
For example, the enum specifying webhook enabled events (which is huge) is specified in 2 separate parameters. The content is identical, but since they are not linked in the spec, they are created for each parameter with a different name following the spec. This was a problem as well before, but some cases were explicitly handled by adding a manual ext file and a mapping of field name / object to that specific type name. This works, but doesn't generalize and has the added issue that manually written types can fall out of date easily (for example the
ApiVersion
enum).Since this PR introduces strict definitions of the content of types we generate, we can detect duplicates like these. The question becomes
(more questions/edits coming soon :))