stripe / openapi

An OpenAPI specification for the Stripe API.
MIT License
393 stars 123 forks source link

POST /v1/subscriptions missing a required parameter? #127

Open stepcut opened 1 year ago

stepcut commented 1 year ago

According to this documentation,

https://stripe.com/docs/api/subscriptions/create

When doing a POST to /v1/subscriptions the customer and items parameters are both required.

But in spec3.json it merely specifies,

               "required": [
                  "customer"
                ],

Which is correct -- the spec or the documentation?

remi-stripe commented 1 year ago

@stepcut there are legacy integration paths that exist on that API that still work but are undocumented. So in a way both are kind of correct. The openapi spec is "technically correct" because our API does support paths where items is not passed and it still works. The API Reference is "conceptually correct" in the sense that it's what we expect most integrations to do.

As a developer, you should assume that items is always required and that the API Reference reflects the correct behaviour to follow. But if you were building a validator for our overall API then you would have to know that items is not always required (though we don't really tell you what else is allowed anymore).

I'll see internally if we can just change the openapi spec to reflect the newer default behaviour as it's been over 5 years since we introduced items but we might not want to change this for now.

I hope this helped clarify the overall state but let me know if not.

stepcut commented 1 year ago

Thanks! This is helpful.

In case you are curious -- here is more information about what I am doing.

First off, I'd like to say that I am really impressed by your legacy integration. I am using a Haskell Stripe binding that uses a 2014 version of the API and it still mostly works. But obviously we should upgrade to a more recent version of the API.

The old version of the Haskell bindings predate the OpenAPI spec and I am working to modernize the bindings.

My initial hope was that I would be able to generate bindings directly from the spec -- but the deeper I get into the project, the less that seems reasonable.

It sounds like the most up to date information is still only available as human readable documentation?

As this binding generator progresses, it contains more and more knowledge that I have encoded by hand, and the spec is often used just to verify that my bindings are still current.

In this case, my bindings will need to manually encode the fact that both CustomerId and items are required. But the code generator will also need to see if any new values get added to the required list in the spec and throw an error.

I am willing to document the various obstacles I have run into trying to generate a binding directly from the specification, but I suspect that the changes which need to be made to the spec to accommodate that are too significant to actually make it on the project plan.

remi-stripe commented 1 year ago

@stepcut Oh that sounds really interesting! The openapi spec is mostly correct and all our client libraries/SDKs today are generated from that exact openapi spec too. We do have some manual "overrides" to try and remove/deprecate old stuff we want to get rid of but it's quite rare. Have you tried looking at the spec we have specifically for our client libraries/SDKs which is spec3.sdk.yaml (or the JSON version)?

Overall, we do recommend avoiding encoding things like this in your library for now. If you look at our own type definitions in stripe-dotnet or stripe-node for example we don't really enforce this in our types signatures and it works fine.

richardm-stripe commented 1 year ago

@stepcut I'm curious, are you building a community library and your vision is to supplant e.g. https://hackage.haskell.org/package/stripe-haskell? Or are you building it mostly for use in your own project?

If you are building a community library I would strongly second Remi's suggestion that you trust the OpenAPI spec and avoid encoding knowledge by hand. As Remi said, the OpenAPI types might appear to be more permissive in some places than they should be, but in each of those cases there is some scenario or edge case that prevents us from using a stricter type, so if the library is for general use you wouldn't want to preclude those.

As Remi hinted by mentioning our "overrides" that we have for the generators of our official SDKs, if you do want to make customizations to the types described by OpenAPI -- my advice is to express those as modifications to apply either to OpenAPI before your generation code runs, or to apply to a representation of your generated output immediately before you render it, vs. what you seem to be describing -- manually making changes to the generated output and then checking it against the spec to find inconsistencies.

In any case, please don't hesitate to reach out to us here to ask us for advice/feedback/any sort of guidance! We are excited you are working on a Stripe library and helping community libraries be successful is a goal of ours. I personally am quite fond of Haskell, too!

stepcut commented 1 year ago

I am a co-author of that stripe-haskell library and the goal is to bring it up to date. That library predates the OpenAPI spec, so everything had to be handcoded when it was originally written.

I have been using spec3.json since the README.md says that spec3.sdk.json is for Stripe-internal use and that users should use spec3.json instead. Should I ignore that advice and use spec3.sdk.json?

Both spec3.json and spec3.sdk.json seem to lack a lot of useful information.

For example, the PostSubscriptions specification defines the optional parameter,

                  "default_tax_rates": {
                    "anyOf": [
                      {
                        "items": {
                          "maxLength": 5000,
                          "type": "string"
                        },
                        "type": "array"
                      },
                      {
                        "enum": [
                          ""
                        ],
                        "type": "string"
                      }
                    ],
                    "description": "The tax rates that will apply to any subscription item that does not have `tax_rates` set. Invoices created will have their `default_tax_rates` populated from the subscription."
                  },

This says that default_tax_rates is an array of any strings. But, I highly doubt that is true. It is almost certainly expecting strings which look like "txr_1NdZcI2eZvKYlo2Cx4QEfiLG"?

I have looked at the official client libraries and they do seem to give the users a lot of rope to hang themselves. If I was to model my library after the existing libraries then a user wanting to use createSubscription might write code that looks a bit like this:

myFunction :: StripeConfig -> Text -> Text -> Text -> IO ()
myFunction config custId priceId taxRateId =
    do result <- stripe config $ createSubscription -&- ("item", [ ("price", priceId) ])
                                                    -&- ("default_tax_rates", taxRateId)
                                                    -&- ("collect", "currently_due")
       checkResult result

main :: IO ()
main =
    do let config    = StripeConfig (StripeKey "secret key")
       custId    <- <code to get customerId>
       priceId   <- <code to get priceId>
       taxRateId <- <code to get taxRateId>
       myFunction config custId taxRateId priceId

While that code might compile it is riddled with errors:

  1. The user forgot to pass in the required customer field to createSubscription
  2. The user misnamed the field item instead of items
  3. The user passed a single tax_rate to default_tax_rates instead of a list of tax_rate
  4. The user mixed up the order of the arguments when calling myFunction and swapped taxRateId and priceId
  5. The user added the collect parameter -- which is not actually a parameter for createSubscription

Each of these errors would result in a runtime error and a tedious debugging cycle. But eventually they would get it work. So by some definitions it 'works fine'.

However, in the old handwritten version of the Haskell Stripe library, we can detect all these mistakes at compile time by making use of the type system.

myFunction :: StripeConfig -> CustomerId -> PriceId -> TaxRateId -> IO ()
myFunction config custId priceId taxRateId =
    do let items = Items [ item { price = priceId } ]
       result <- stripe config $ createSubscription custId items
                                                    -&- DefaultTaxRates [taxRateId]
       checkResult result

main :: IO ()
main =
    do let config    = StripeConfig (StripeKey "secret key")
           custId    <- <code to get customerId>
           priceId   <- <code to get priceId>
           taxRateId <- <code to get taxRateId>
       myFunction config custId priceId taxRateId

In this more Haskelly version:

  1. The required parameters are actual arguments to createSubscription.
  2. field values have constructors like Items and DefaultTaxRates -- so the compiler will complain if you make a typo
  3. the type system knows that DefaultTaxtRates takes a [TaxRateId] as a parameter and will complain if you don't give it a list
  4. The various ids are wrapped in newtypes so you can't accidentally mix up PriceId and TaxRateId
  5. The type system knows which parameters are allowed for each operation and will complain if you add invalid parameters

The Haskelly version adds a ton of type safety but is just as concise.

Unfortunately, even in spec3.sdk.json is seems there is no way for the code generator to know that default_tax_rates is specifically a list of tax_rate ids and not just any old string. So instead we have to either handcode that information into the code generator, or burden all the users to read the Stripe documentation and figure that out. However the API documention merely states:

default_tax_rates optional

The tax rates that will apply to any subscription item that does not have tax_rates set. Invoices created will have their default_tax_rates populated from the subscription.

So even reading the documentation does not make it clear what format is expected for default_tax_rates.

In the Haskell Stripe library, we can encode the fact that the default_tax_rates is a [TaxRateId]. So when users see our documentation, there is no confusion about what is expected.

Our code generator can check that the definition in the specification hasn't changed. If the spec does change, then we have to verify if the changes are compatible or not and potentially update the code generator.

The spec also has this curious enum which has no actual enumeration values,

                  "default_tax_rates": {
                    "anyOf": [
                      {
                        "items": {
                          "maxLength": 5000,
                          "type": "string"
                        },
                        "type": "array"
                      },
                      {
                        "enum": [
                          ""
                        ],
                        "type": "string"
                      }
                    ],

I think what that means is that you can pass in an empty string instead of an empty list and it does something different?

One thing I would love to see in the spec is that ids like tax_rate_id are defined in the components section -- perhaps a bit like this,

"components": {
    "schemas": {
      "tax_rate_id": {
          "description": "Unique identifier for the tax rate object."
          "maxLength": 5000,
          "type": "string"
          }
     }
 }

And then referenced everywhere they are used like this:

                  "default_tax_rates": {
                    "anyOf": [
                      {
                        "items": {
                          "$ref": "#/components/schemas/tax_rate_id"
                        },
                        "type": "array"
                      },
                      {
                        "enum": [
                          ""
                        ],
                        "type": "string"
                      }
                    ],

This would also help clarify endpoints like this one,

    "/v1/application_fees/{fee}/refunds/{id}": {
...
          {
            "in": "path",
            "name": "fee",
            "required": true,
            "schema": {
              "maxLength": 5000,
              "type": "string"
            },
            "style": "simple"
          },
          {
            "in": "path",
            "name": "id",
            "required": true,
            "schema": {
              "maxLength": 5000,
              "type": "string"
            },
            "style": "simple"
          }

what is fee? what is id?

This seems a lot more useful,

    "/v1/application_fees/{fee}/refunds/{id}": {
...
          {
            "in": "path",
            "name": "fee",
            "required": true,
            "schema": {
               "$ref": "#/components/schemas/fee_id"
            },
            "style": "simple"
          },
          {
            "in": "path",
            "name": "id",
            "required": true,
            "schema": {
               "$ref": "#/components/schemas/fee_refund_id"
            },
            "style": "simple"
          }

With these changes, you could still generate the current bindings for Python, Node, etc, which make heavy use of untyped strings everywhere. But it would also make it far easier to create bindings that make better use of the type system in languages that have reasonable type systems.

tl;dr -- should I ignore the advice in README.md switch to spec3.sdk.json instead of spec3.json ?

richardm-stripe commented 1 year ago

This says that default_tax_rates is an array of any strings. But, I highly doubt that is true. It is almost certainly expecting strings which look like "txr_1NdZcI2eZvKYlo2Cx4QEfiLG"?

This is an excellent point and thank you very much for the suggestion. We haven't put this information in the OpenAPI spec, probably because none of our target languages have anything ergonomically similar to Haskell's "newtype", but come to think of it we could use template literal types in Typescript.

Some of our official libraries do use types to prevent some of the problems you describe. We have Typescript in stripe-node now, which prevents field misnamings, mistypings, captures requiredness, etc. Stripe-Java has typed params via builder methods now (this may not have been the case when stripe-haskell originally cropped up), which can't help with requiredness but does prevent most of the other problems. Our dynamically-typed languages still have a lot of problems that you describe but we're working on introducing gradual types, which will help.

In any case, with the exception of id types more specific than string, our Typescript bindings provide all of those safeguards, and we generate that from spec3.sdk.yaml so that should indicate that everything you need at least to get that level of type-safety is in there.

The reason we have that disclaimer for spec3.sdk.yaml is because we don't offer any stability guarantees there, and we've done some pretty significant breaking changes to the metadata that's in that file over the past few years. Most of what's in there beyond what's in spec3.yaml is

  1. Information we use to name classes and types.
  2. Information we use to group API methods onto resource classes.
  3. Information about what fields are "expandable".

A lot of this you may be able to do without. You might be able to name and group methods solely by the HTTP paths, for instance, but we need that information because we hold ourselves pretty heavily to backwards-compatibility.

The spec also has this curious enum which has no actual enumeration values.

We call that "emptystringable" or "emptyable". anyof("", X) indicates that you can send "" to the API in order to "empty" the value there. null would be nicer, but Stripe speaks formencoded, rather than JSON.

should I ignore the advice in README.md switch to spec3.sdk.json instead of spec3.json ?

If you find a use for it you can, but just keep in mind there's no stability guarantee, so you're signing up for work if/when we iterate on the metadata there. One way to kind of protect yourself from changes there -- and what we do in our own generators -- is we immediately parse OpenAPI into our own internal representation that represents exactly the information we need to generate the library in a much more convenient form.

richardm-stripe commented 1 year ago

Also, would you be willing to open up a new issue (with the "feature request" label) asking for more granular "id" types? You can retitle this one, too.

richardm-stripe commented 1 year ago

Oh, @remi-stripe has reminded me that

template literal types in Typescript.

is NOT actually a good idea because we actually do change the prefixes sometimes and in some cases allow users to specify their own IDs. What you are suggesting still works though -- maybe starting with txr_ isn't sufficient to prove that a string is a tax rate id, but appearing on an API response where tax rate ids appear still is.