marshmallow-code / apispec

A pluggable API specification generator. Currently supports the OpenAPI Specification (f.k.a. the Swagger specification)..
https://apispec.readthedocs.io/
MIT License
1.18k stars 177 forks source link

Enum by_value creates OpenAPI property without type #875

Closed zedrdave closed 10 months ago

zedrdave commented 10 months ago

If using a native Enum:

import enum
class FoobarEnum(enum.Enum):
    FOO = "foo"
    BAR = "bar"

then using it in a schema:

foobar = fields.Enum(FoobarEnum, by_value=False)

We get the correct OpenAPI JSON:

                  "foobar": {
                    "type": "string",
                    "enum": [
                      "FOO",
                      "BAR"
                    ]
                  },

But if we set by_value to True, we get:

                  "foobar": {
                    "enum": [
                      "foo",
                      "bar"
                    ]
                  },

The missing "type": "string", trips up Swagger, which does not display the enum contents…

This seems potentially related to https://github.com/marshmallow-code/apispec/pull/807, but I can't see anything in the code that would lead to this difference (I can't quite figure where type is inferred)…

zedrdave commented 10 months ago

@sloria if you are able to point me in the direction of the code that defines how that type is decided, I would be happy to dig further and hopefully submit a PR

sloria commented 10 months ago

@zedrdave thanks for volunteering!

i believe the relevant method is field2type_and_format, which takes a marshmallow field and returns the corresponding OAI type and format as a dict.

relevant test is here: https://github.com/marshmallow-code/apispec/blob/fb5857088b6241bd74239a8723691997f801da06/tests/test_ext_marshmallow_field.py#L294

Bangertm commented 10 months ago

A couple more details on how the type is decided:

For enum fields there is enum2properties. This is indirectly calling field2type_and_format with the enum field's field.field property. In the case where by_value is True, field.field is a marshmallow.fields.Field instance. This class is set to have no type property in the field mapping: https://github.com/marshmallow-code/apispec/blob/fb5857088b6241bd74239a8723691997f801da06/src/apispec/ext/marshmallow/field_converter.py#L37 That's typically because apispec can't infer the type of a generic field.

Not sure what the proper behavior would be in this case because I don't have any experience working with enum fields.

zedrdave commented 10 months ago

@Bangertm thanks for diving into this further. I must admit I'll need a bit of time to potentially wrap my head around that field.field thing… I don't quite understand why it would be Field in the by_value = True case, and not when False

But basically the behaviour we want to achieve is: type = "string" no matter what, if the Field is enum… It seems like adding a simple ret["type"] = "string" on line 515 would fix it, but I suspect this is not the cleanest way to do it…

zedrdave commented 10 months ago

@Bangertm OK, I think I understand better, and as best as I can tell, this is an upstream issue with Marshmallow setting field to Field() in the by_value case instead of string otherwise (not sure I understand the reasoning): https://marshmallow.readthedocs.io/en/stable/_modules/marshmallow/fields.html#Enum

Short of this being changed upstream, it seems like the only option would be to correct it inside enum2properties as I suggested above, maybe around line 518…?

zedrdave commented 10 months ago

FWIW: passing by_value=fields.String instead of by_value=True does fix the problem… But I am not sure it makes sense ever omitting the string type for Enums in OpenAPI, so maybe checking for by_value is True and setting it if missing, would be a good way to solve this?

lafrech commented 10 months ago

I don't remember the whole story, there may be retro-compatibility involved, but I think the rationale in marshmallow is that when using by_value=True, without any more information about the value type, marshmallow can only use Field and hope the value doesn't need any serialization specifics. Any non-trivial type must pass a field class. In fact, even strings "should" probably pass a field type. Should marshmallow set String by default? Not sure.

In any case, my advice would be to set by_value=ma.fields.String when needed.

I wouldn't expect apispec to set a string type if nothing is specified. If values are integers, for instance, documenting them as string would be wrong.

zedrdave commented 10 months ago

Indeed, by_value=ma.fields.String does fix it. It seems a little weird to me that this is needed, when Marshmallow could, at the very least, infer the enum type and use that (or just assume str unless a Field is specified), but that is indeed more of a Marshmallow usability issue than apispec.

I'm not sure I see how adding a default type (if no Field is specified) would hurt:

lafrech commented 10 months ago

I understand. But I still think explicit is better than implicit. Also, I'm afraid using String rather than Field if nothing is specified would be a breaking change.

More history here if you're interested (I'm the author of the PRs adding Enum to marshmallow): https://github.com/marshmallow-code/marshmallow/pull/2017 https://github.com/marshmallow-code/marshmallow/pull/2044

We use Field by default no deserialization is needed since Enum validates the content anyway so we don't really care. And we figured people wouldn't bother setting a field for this reason, but it appears this doesn't work well for the apispec case (hence this issue).

zedrdave commented 10 months ago

I would argue there is already a strong implicit behaviour (people setting by_value to True and assuming everything will work), but you definitely have a better view on the overall problem, so I'll defer to you.

I would still recommend finding a way to warn users of that behaviour (if only by putting a note in marshmallow's Enum doc recommending to set by_value to an explicit type if possible).

Happy to close this in any case.

sloria commented 10 months ago

Upon reading this thread more closely, I agree that explicitly passing a field to by_value is the best approach. I've added a specific note about this in the docs in #876 .

Closing for now. Thanks for the discussion all!