AArnott / Nerdbank.MessagePack

A .NET MessagePack serialization library with great performance and simplicity.
https://aarnott.github.io/Nerdbank.MessagePack/
MIT License

Schema export function #131

Closed by AArnott 3 hours ago

AArnott commented 2 days ago

There should be an API that can take an ITypeShape<T> and export a schema.

This could perhaps be a feature of PolyType instead of MessagePack, since the use case applies more broadly and the result would mostly be the same. However, Nerdbank.MessagePack adds attributes that can alter the schema, and runtime options such as the name policy may alter it too. (CC: @eiriktsarpalis)

Another open question is what the format of the schema should be. Should it be machine-parseable? If so, other languages could import the schema and recreate the type definitions natively to support interop. Maybe standardize on JSON Schema?

Example

Consider this set of types:

[GenerateShape]
partial class Tree
{
   public TreeVariety Variety { get; set; }
   public List<Fruit> Fruits { get; set; }
}

enum TreeVariety
{
   Apple,
   Orange,
}

class Fruit
{
   [PropertyShape(Name = "isRipe")]
   public bool Ripe { get; set; }
}

This would render as the following language-neutral schema, assuming a call to GenerateSchema<Tree>():

Variety: TreeVariety
Fruits: Fruit[]

TreeVariety:
   Apple = 0     # How do we disclose whether the ordinal or the name is serialized?
   Orange = 1

Fruit:
   isRipe: bool # Note the lack of the original `Ripe` property name because that is completely irrelevant for interoperability

bool:
   true = 1
   false = 0

This assumes the object graph is serialized with maps of properties. When an object is serialized as an array of values instead, the schema ought to indicate that. For example:

[GenerateShape]
partial class Person
{
  [Key(0)]
  public string FirstName { get; set; }

  [Key(2)]
  public string LastName { get; set; }
}

Renders as the following schema:

Person: [
   FirstName: string,
   undefined,
   LastName: string,
]
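
If JSON Schema were chosen as the format, the array layout above could be expressed with positional item schemas. The following is only a sketch built with System.Text.Json.Nodes. It assumes JSON Schema 2020-12, where `prefixItems` describes array elements by position, and it assumes the unused index 1 is represented by an empty schema that accepts any value. Neither choice is settled in this issue, and using `title` to carry the C# property name is likewise just one option.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical JSON Schema for Person when serialized as an array (keys 0 and 2).
// Assumption: JSON Schema 2020-12, where "prefixItems" lists one schema per position.
var personSchema = new JsonObject
{
    ["type"] = "array",
    ["prefixItems"] = new JsonArray(
        new JsonObject { ["type"] = "string", ["title"] = "FirstName" },
        new JsonObject(), // index 1 has no [Key]; an empty schema accepts any value
        new JsonObject { ["type"] = "string", ["title"] = "LastName" }),
};

Console.WriteLine(personSchema.ToJsonString(new JsonSerializerOptions { WriteIndented = true }));
```

The property names appear only as titles here, because a reader of the array form identifies each value by position rather than by name.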

If C# properties or types are annotated with [Description("...")], we may want to include that string in the schema. JSON Schema, for example, has room to store such a description.

AArnott commented 2 days ago

Hmmm... Maybe this already exists as a sample I can build off of:

https://github.com/eiriktsarpalis/PolyType/blob/main/src/PolyType.Examples/JsonSchema/JsonSchemaGenerator.cs

AArnott commented 1 day ago

@eiriktsarpalis I'm a bit surprised that you didn't use the visitor pattern for the JSON schema construction. Any particular reason?

eiriktsarpalis commented 1 day ago

> I'm a bit surprised that you didn't use the visitor pattern for the JSON schema construction. Any particular reason?

The primary purpose of the visitor is unwrapping type information of the type graph via its generic parameters. If there's no need for this (e.g. because property accessors or constructors aren't necessary) then using a simple pattern match/switch statement should suffice to perform the traversal.
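
To illustrate the point about a plain switch being enough, here is a rough sketch of a non-generic traversal that emits a schema. It deliberately uses `System.Type` and reflection as a stand-in for a type shape, so it is not PolyType's API and not the linked JsonSchemaGenerator. The `Describe` and `DescribeObject` helpers and the anonymous sample type are made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.Json.Nodes;

// Stand-in traversal: System.Type plays the role of a type shape.
// Only names and kinds are needed, so no generic type parameters are unwrapped.
// Note: there is no cycle detection, so recursive types would not terminate.
JsonObject Describe(Type type) => type switch
{
    _ when type == typeof(bool) => new JsonObject { ["type"] = "boolean" },
    _ when type == typeof(string) => new JsonObject { ["type"] = "string" },
    _ when type.IsEnum => new JsonObject
    {
        ["enum"] = new JsonArray(Enum.GetNames(type).Select(n => (JsonNode?)n).ToArray()),
    },
    _ when type.IsArray => new JsonObject
    {
        ["type"] = "array",
        ["items"] = Describe(type.GetElementType()!),
    },
    _ => DescribeObject(type),
};

// Any other type is treated as an object with one schema entry per public property.
JsonObject DescribeObject(Type type)
{
    var properties = new JsonObject();
    foreach (var property in type.GetProperties())
    {
        properties[property.Name] = Describe(property.PropertyType);
    }

    return new JsonObject { ["type"] = "object", ["properties"] = properties };
}

// Anonymous types keep the sample self-contained; DayOfWeek stands in for an enum.
var treeLike = new { Variety = DayOfWeek.Monday, Fruits = new[] { new { isRipe = true } } };
JsonObject schema = Describe(treeLike.GetType());
Console.WriteLine(schema.ToJsonString());
```

A visitor would become necessary once the traversal has to call strongly typed property getters or constructors, which this sketch never does.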

AArnott commented 22 hours ago

I got really far with your approach. Now I'm experimenting with a retrofit wherein I ask each MessagePackConverter<T> to contribute its schema fragment. It'll better support custom converters and is more likely to be kept up-to-date than if we have just one class with 'all the answers' built-in.

eiriktsarpalis commented 11 hours ago

What type do these converters return? JsonNode presumably?

AArnott commented 7 hours ago

For the schema? They return JsonObject.
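
A rough sketch of the idea of each converter contributing its own fragment as a JsonObject follows. It is not the Nerdbank.MessagePack API: the `contributors` dictionary and its string keys are hypothetical stand-ins for converters, with each delegate playing the part of one MessagePackConverter<T>. The `Variety` property is left out because the enum representation is still an open question above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical registry: each delegate stands in for one converter's schema fragment.
// Every call builds a fresh JsonObject, because a JsonNode can have only one parent.
var contributors = new Dictionary<string, Func<JsonObject>>();

contributors["bool"] = () => new JsonObject { ["type"] = "boolean" };

// The Fruit "converter" composes the fragment contributed for its property type.
contributors["Fruit"] = () => new JsonObject
{
    ["type"] = "object",
    ["properties"] = new JsonObject { ["isRipe"] = contributors["bool"]() },
};

// The Tree "converter" delegates to the Fruit contributor for its list elements.
contributors["Tree"] = () => new JsonObject
{
    ["type"] = "object",
    ["properties"] = new JsonObject
    {
        ["Fruits"] = new JsonObject { ["type"] = "array", ["items"] = contributors["Fruit"]() },
    },
};

JsonObject treeSchema = contributors["Tree"]();
Console.WriteLine(treeSchema.ToJsonString());
```

With this shape, a custom converter only has to supply its own fragment, and the composed schema picks it up wherever that type appears.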