tfranzel / drf-spectacular

Sane and flexible OpenAPI 3 schema generation for Django REST framework.
https://drf-spectacular.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.31k stars 258 forks source link

Use qualname to disambiguate multiple components with identical names #51

Closed jayvdb closed 3 years ago

jayvdb commented 4 years ago

Mental note for https://github.com/tfranzel/drf-spectacular/issues/27#issuecomment-625652702 ;-)

tfranzel commented 4 years ago

the enum part is done. that leaves the serializer naming.

jayvdb commented 4 years ago

Oh my, the enum approach is IMO ugly. __qualname__, coupled with __module__ already gives uniqueness. I do see the value of using the hash , and merging enums with the same choices, but IMO that should be optional. I would like to opt-out, and I am happy to offer a PR implementing qualname based names real quick to avoid these hashes appearing in my TypeScript filenames and class names.

The hashing approach will merge two enums just because they are the same, which is introducing another case of disappearing enums - the old types disappear and become one. They may have the same choices at one time version of an API, but differ in a different version of the API, and it would be better for the component name to stay stable across those changes.

In addition to the merging aspect, I have enums which change choices often, and their hash would change, and client generated model filenames and class names would change each time, with flow on effects throughout the codebase.

tfranzel commented 4 years ago

Perfect is the enemy of good. :P

as you said. sry that you are dissatisfied with the solution.

We only need a default behaviour for the fully automatic schema generation that is safe/sane most of the time, so to speak.

i tried to go for that.

If that separation doesnt help disambiguate two enums, numerical suffixes can be used -- my typescript code generator uses that approach as a fallback for multiple retrieve(get) endpoints

i did exactly what you suggested. choose a "stable" suffix to disambiguate instead of disappearing the enums.

The hashing approach will merge two enums just because they are the same, which is introducing another case of disappearing enums - the old types disappear and become one. They may have the same choices at one time version of an API,

the whole point of the enums is consolidating identical choices. now you want to keep them separate. can't have both ways. i'm afraid you ask for multiple things at the same time that are mutually exclusive. since the choice lists are no class objects, they have no name in itself. all you can do is use the hash of the set. the differentiation you ask for is simply not possible here.

Oh my, the enum approach is IMO ugly. qualname, coupled with module already gives uniqueness.

that would introduce duplication again. suppose you have 2 distinct enums with the name status that you use at 6 different locations. the suffix approach resolves this to 2 enums. your approach would result in 6 enums. the qualname approach will be good for the serializers, but not a good match for the enums.

In addition to the merging aspect, I have enums which change choices often, and their hash would change, and client generated model filenames and class names would change each time, with flow on effects throughout the codebase.

the enum resolution is supposed to be stable and comprehensible (safe/sane as you said). the suffixes are a way to resolve issues automatically. you can and should resolve the emmited conflicts via the settings: https://drf-spectacular.readthedocs.io/en/latest/faq.html#i-get-warnings-regarding-my-enum-or-my-enum-names-have-a-weird-suffix . if you explicitly name you conflicting enum sets, you won't see suffixes (and therefore names wouldn't change).

tfranzel commented 4 years ago

@jayvdb any feedback on using the ENUM_NAME_OVERRIDES setting? i think most (if not all) of your collision issues can be resolved with this. i also improved the setting handling a bit.

if you want to go a completely different way you can always replace the default hook with your own custom one. the POSTPROCESSING_HOOKS are configurable exactly for that purpose. I understand the approach i took is somewhat opinionated, but i believe this will do the right thing for the majority of cases.

also the workflow should not be that awkward because it is in line with rest of spectacular: run generation, see a warning, fix issue with annotation/hint/setting if desired. otherwise ignore and carry on.

jayvdb commented 4 years ago

@tfranzel , if you dont want to have an open issue about ways to automatically generating semantically meaningful component names , close this issue.

This issue is not about settings which allow the implementer to workaround deficiencies in the tool. It is great that those workarounds exist, so that people can implement it already. Pragmatism requires that. This issue is about how to correctly generate names, even in the pretense of naming conflicts by falling back to python name uniqueness. Because we can, easily in Python 3, using qualname. It used to be hell in Python 2 because qualname didnt exist so names within a module could easily clash without intensive use of inspect.

I've re-run with master, and I am now not getting generated hash-based-named enums appearing in my schema. That was a no-go for me, and meant I had to revert that commit. However, this was fixed because I removed /oscarapi/admin/ from my URLConf, effectively deferring this problem for me. But I am interested enough in it to keep exploring it, as we do intend to re-add /oscarapi/admin/ and a back-office app using it, after the dust has settled on the initial go live for the customer app. When I re-add it, here is the relevant part of the diff

@@ -4664,7 +8027,7 @@ components:
           type: string
           maxLength: 128
         type:
-          $ref: '#/components/schemas/TypeEnum'
+          $ref: '#/components/schemas/Type6faEnum'
       required:
       - code
       - name
@@ -5178,6 +8810,128 @@ components:
           multipleOf: 0.01
           maximum: 10000000000
           minimum: -10000000000
+    PatchedOption:
+      type: object
+      description: Correctly map oscar fields to serializer fields.
+      properties:
+        url:
+          type: string
+          format: uri
+          readOnly: true
+        code:
+          type: string
+          pattern: ^[-a-zA-Z0-9_]+$
+        name:
+          type: string
+          maxLength: 128
+        type:
+          $ref: '#/components/schemas/Type6faEnum'
     PatchedPartner:
       type: object
       description: Correctly map oscar fields to serializer fields.
@@ -5386,6 +9140,36 @@ components:
       - productClass
       - stockrecords
       - url
+    ProductAttribute:
+      type: object
+      description: Correctly map oscar fields to serializer fields.
+      properties:
+        url:
+          type: string
+          format: uri
+          readOnly: true
+        productClass:
+          type: string
+          writeOnly: true
+        optionGroup:
+          allOf:
+          - $ref: '#/components/schemas/AttributeOptionGroup'
+          nullable: true
+        name:
+          type: string
+          maxLength: 128
+        code:
+          type: string
+          pattern: ^[-a-zA-Z0-9_]+$
+          maxLength: 128
+        type:
+          $ref: '#/components/schemas/Type499Enum'
+        required:
+          type: boolean
+      required:
+      - code
+      - name
+      - url
     ProductAttributeValue:
       type: object
       description: Correctly map oscar fields to serializer fields.
@@ -5722,7 +9506,22 @@ components:
       - Ms
       - Dr
       type: string
-    TypeEnum:
+    Type499Enum:
+      enum:
+      - text
+      - integer
+      - boolean
+      - float
+      - richtext
+      - date
+      - datetime
+      - option
+      - multi_option
+      - entity
+      - file
+      - image
+      type: string
+    Type6faEnum:
       enum:
       - Required
       - Optional
       type: string

ProductAttributeValue is a EAV field that Oscar has embedded. c.f. https://github.com/makimo/django-eav2 for the generic component which they Oscar-ified (crippling/simplifying it to only work with one model, but probably also enhancing it).

The enum values come from https://github.com/django-oscar/django-oscar/blob/61ce21e/src/oscar/apps/catalogue/abstract_models.py#L804

I see nothing in https://github.com/django-oscar/django-oscar-api/blob/master/oscarapi/serializers/admin/product.py related to the type field, so it is being merely carried through it seems.

The naming of this enum can be disambiguated as follows:

  1. concat(model_name_or_label, field_name) : ProductAttributeType - the enum is actually defined inside this model, so that name will be consistently used for any usage in the API where the model can be deduced, or the model can be explicitly described with an extension.
  2. concat(app_name, model_name_or_label, field_name) : CatalogueProductAttributeType - needed if there are two apps with the same model name
  3. concat(name_from_serializer, field_name) : PatchedProductAttributeType - it only appears as a field in this serializer. Highly not recommended, as the same enum could appear elsewhere as the schema grows, and they wouldnt be merged together, because it wouldnt be simple to correctly determine which serializer name to prefer (maybe shortest?). This is one case where hash-names is possibly the 'best' approach if the same enum appears twice. If the enum only appears once, the serializer based name is semantically more useful in an API. When it appears twice, more semantic information is provided by the merging than by the name of the merged component.
  4. concat(name_from_view, field_name) : ProductAttributeAdminList..Type - if neither serializer or model can be found, such as a queryset which defies attempts to inspect it.
  5. concat(module_and_name_from_view, field_name) : OscarapiAdminProductProductAttributeAdminList - here is where qual name and module paths ensure uniqueness. Perhaps horridly long, and some people may prefer the hashed name approach here

Before concluding this issue is optimally solved, some analysis of the top few 'enum fields' should be done to see how they work, and ensure that optimal (or at least reasonably semantically correct) names are given to their enums.

tfranzel commented 3 years ago

If the enum only appears once, the serializer based name is semantically more useful in an API.

that is what i did with 5da017f. this may hopefully resolve at least some your enum collisions.

Type499Enum -> ProductAttributeTypeEnum

not a big fan of involving the model. imho it's same problem there when the choice set is reused in multiple models.

i also have quite a fewtype fields that only get used in one spot. if the same (field_name, choice_set) is only used in one component it can be safely prefixed with the component name without any confusion. i think that should cover at least some of your points. i just noticed i haven't handled the patched case yet. that would be still a todo.

the ugly solution is still happening when a we have the same name with different choice sets in different components.

of course component collisions are a separate topic. the enums will likely improve even a bit more when serializer component collisions are properly handled.

tfranzel commented 3 years ago

there have been some more minor improvements. given the framework limitations, this is most likely as good as it gets. closing because it's long and stale.