Open gregsdennis opened 1 year ago
The more I think about it, I'm coming to the conclusion that polymorphism isn't a good approach for code-generated models. I'd like to present an argument that we don't need to support polymorphism at all with this vocabulary.
If we think about the common use cases for code generation (specifically from JSON Schema), it's generally for creating types for data given to us by APIs of some sort, whether from a web source like OpenAPI or from a local source, like a CSV. Ideally, the models we end up making should be no more than DTOs.
Also, best practice for application architecture recommends that we don't use DTOs inside the application logic. Instead we should create domain models for the critical portions of the application to use and map between the two models when we need to interact with the API.
Secondly, polymorphism is commonly used to avoid code duplication, especially when the domain (business logic) requires that the same property on different types mean the same thing. You can extract a base type from the types and define that property once on the base type.
However, given that code generation is typically making DTOs, does it matter if properties are duplicated across multiple types? I can't say that it does.
See JSON Schema support in the Manifold project for an existing implementation, most relevant to this discussion is Composition with AllOf.
Relevant discussion:
Examples of current use:
Typically, inheritance is modelled something like this:
There are no inherent relationships between the data types these schemas describe, and a codegen could simply create three disparate classes, each with all of the required properties. However, that's not the intent behind the schemas. In order for that intent to be more explicitly expressed while still allowing for normal validation to occur, I propose we add an new annotation keyword that a schema should represent a "derived" type.
derived
is a boolean that directs codegen tools to consider in-place references that appear either on their own (sibling$ref
) or in anallOf
as base types.The above would change to
Without this keyword, or with a value of
false
, codegen will be directed to create disparate types.Multiple inheritance
If we had a schema that had multiple references in an
allOf
, each reference would be a base type.for some
extension-data
schema.This, however, requires multiple inheritance, which represents a problem for many languages. For example, JavaScript and .Net don't support multiple inheritance, whereas C++ and Python do. The workaround for those that don't may be to render the base types as interfaces (i.e. type definitions with no implementation). A "derived" class can then implement any number of these interfaces. You still get the polymorphism, but you don't get a class hierarchy.
For example, in C#,
derived-3
may be generated asDerived3
can be used anywhere anIBase
orIExtensionData
could be used, and any JSON instance that validates againstderived-3
also validates againstbase
andextension-data
, so we have polymorphism in that respect.However if we need to instantiate an
IBase
, we'd also need to create a classBase
that implementsIBase
, and thatBase
class would not be polymorphic withDerived3
.It will need to be up to the tool to discern when to create base classes vs base interfaces as required by the generated language.
(This also illustrates why the JSON Schema team has historically recommended that generative logic should only be used as a developer tool, not in production. Generative logic cannot cater to every scenario, and any generated code should be verified before it's used.)
Other subschemas within an
allOf
This issue only covers
$ref
schemas insideallOf
. Subschemas could be handled as either as new types to be "inherited" from or merely as additional definition on the current type.I've opened a separate issue for this since how to handle subschemas ties in with simple objects (https://github.com/json-schema-org/vocab-idl/issues/46).
Allowing undeclared properties
Many comments on this topic seem to want to apply
additionalProperties
to the base. But this is wrong. It would mean that aderived-*
is not abase
. But in terms of inheritance, aderived-*
is abase
. LeavingadditionalProperties
off solves the problems that arise when it's there.