rs / rest-layer

REST Layer, Go (golang) REST API framework
http://rest-layer.io
MIT License

[jsonschema] Support custom Validators #35

Closed smyrman closed 7 years ago

smyrman commented 8 years ago

One of the strengths of the rest-layer/schema package is that users can define their own Validators. However, such custom validators are currently not handled by the schema/encoding/jsonschema package.

Solution 1: Marshaler interface

One solution would be to add an interface in the schema/encoding/jsonschema package similar to what one would find in the encoding/json package from the standard libs.

type Marshaler interface {
        MarshalJSONSchema() ([]byte, error)
}

One could also choose to be a bit different, and go for an interface that allows directly passing in the writer to use:

type Marshaler interface {
        MarshalJSONSchema(w io.Writer) error
}

Either way, the jsonschema package should test if this interface is implemented for any given validator, and prefer to use it if it is.
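
As a rough illustration of the "test if this interface is implemented" step, here is a minimal sketch using the writer-based variant. The Validator interface, the UpperString and Plain types, and the encodeValidator and render helpers are all hypothetical stand-ins defined locally so the example runs on its own; none of them are part of rest-layer.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Validator is a local stand-in for a rest-layer validator (assumption).
type Validator interface {
	Validate(value interface{}) (interface{}, error)
}

// Marshaler is the writer-based interface proposed in this issue.
type Marshaler interface {
	MarshalJSONSchema(w io.Writer) error
}

// UpperString is a hypothetical custom validator that implements Marshaler.
type UpperString struct{}

func (UpperString) Validate(value interface{}) (interface{}, error) { return value, nil }

func (UpperString) MarshalJSONSchema(w io.Writer) error {
	_, err := io.WriteString(w, `"type": "string", "pattern": "^[A-Z]*$"`)
	return err
}

// Plain is a hypothetical custom validator that does not implement Marshaler.
type Plain struct{}

func (Plain) Validate(value interface{}) (interface{}, error) { return value, nil }

// encodeValidator prefers the Marshaler implementation when there is one,
// and reports an error for validators it does not know how to encode.
func encodeValidator(w io.Writer, v Validator) error {
	if m, ok := v.(Marshaler); ok {
		return m.MarshalJSONSchema(w)
	}
	return fmt.Errorf("jsonschema: unsupported validator type %T", v)
}

// render collects the output of encodeValidator in a string.
func render(v Validator) (string, error) {
	var sb strings.Builder
	err := encodeValidator(&sb, v)
	return sb.String(), err
}

func main() {
	for _, v := range []Validator{UpperString{}, Plain{}} {
		s, err := render(v)
		fmt.Printf("%T: %q, err=%v\n", v, s, err)
	}
}
```

In a real implementation the fallback branch would presumably be the existing handling of the built-in validator types rather than an error.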

The second option might be simplest in terms of allowing users to extend existing validators. E.g. someone might write something like:

type SizedDict struct{
        schema.Dict
        MaxValues, MinValues int
}

func (sd SizedDict) Validate(value interface{}) (interface{}, error) {
        value, err := sd.d.Validate(value)
        if err != nil {
                return nil, err
        }
        ...
}

func (sd SizedDict) MarshalJSONSchema(w io.Writer) error {
        enc := jsonschema.NewEncoder(w)
        if err := enc.Encode(sd.d); err != nil {
                return err
        }
        ...
}

If the first option is chosen for the interface, it might be beneficial to define an equivalent to json.Marshal in the jsonschema package to simplify user extensions.
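
To make that last point concrete, here is a small sketch of what a json.Marshal-like helper could look like for the []byte variant. The Marshal function, the simplified Dict stand-in (the real schema.Dict has fields and a Validate method) and the fragment format are assumptions for illustration only.

```go
package main

import "fmt"

// Marshaler is the []byte-returning variant of the proposed interface.
type Marshaler interface {
	MarshalJSONSchema() ([]byte, error)
}

// Marshal is a hypothetical package-level helper, loosely mirroring
// json.Marshal, that extensions can call to encode an embedded validator.
func Marshal(v interface{}) ([]byte, error) {
	m, ok := v.(Marshaler)
	if !ok {
		return nil, fmt.Errorf("jsonschema: %T does not implement Marshaler", v)
	}
	return m.MarshalJSONSchema()
}

// Dict is a simplified stand-in for schema.Dict (assumption).
type Dict struct{}

func (Dict) MarshalJSONSchema() ([]byte, error) {
	return []byte(`"type": "object"`), nil
}

// SizedDict extends Dict with an additional keyword.
type SizedDict struct {
	Dict
	MaxValues int
}

func (sd SizedDict) MarshalJSONSchema() ([]byte, error) {
	b, err := Marshal(sd.Dict)
	if err != nil {
		return nil, err
	}
	if sd.MaxValues > 0 {
		// The extension has to take care of the separating comma itself.
		b = append(b, fmt.Sprintf(`, "maxProperties": %d`, sd.MaxValues)...)
	}
	return b, nil
}

func main() {
	b, err := Marshal(SizedDict{MaxValues: 3})
	fmt.Println(string(b), err)
}
```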

smyrman commented 7 years ago

As mentioned in #51, I suggest we rethink how the end-user interface could be made easier to use.

smyrman commented 7 years ago

The biggest disadvantage of the proposed solution 1 is that the end-user must correctly handle commas, white-space, quotes and other JSON encoding logic. This can be tedious and error-prone. After some consideration, I have thought through three different solutions, and have landed in favour of the simplest, most stupid one, branded "Solution 2".

As mentioned in #51, this will affect how we should do encoding internally as well.

Solution 2: Return map[string]interface{} (recommended)

This proposed solution involves defining a simple interface for custom validator implementations to implement:

// Draft4Encoder must be implemented by custom schema.Validator implementations in order to
// allow JSONSchema serialization.
type Draft4Encoder interface {
        // JSONSchemaDraft4 should return a map containing valid JSON Schema keys and values
        // based on the available information in the containing type. Adding of additional keywords,
        // e.g. application specific ones, is allowed, but these should not conflict with JSON schema
        // definitions.
        JSONSchemaDraft4() map[string]interface{}
}

Specifying the draft version in the function name makes sense, since there is now ongoing work on the next version of JSON Schema, but I am happy to reconsider that.

Example:

Reusing the example from solution 1, we get:

type SizedDict struct{
        schema.Dict
        MaxValues, MinValues int
}

func (sd SizedDict) Validate(value interface{}) (interface{}, error) {
        ...
}

func (sd SizedDict) JSONSchemaDraft4() map[string]interface{} {
        m := map[string]interface{}{
                "type": "object",
        }
        if sd.MaxValues > 0 {
                m["maxProperties"] = sd.MaxValues
        }
        if sd.MinValues > 0 {
                m["minProperties"] = sd.MinValues
        }
        return m
}
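
For completeness, the consuming side of this interface could be as small as the sketch below: a type assertion followed by encoding/json. The encode helper is hypothetical, and SizedDict is simplified here (no embedded schema.Dict) so that the example is self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Draft4Encoder is the interface proposed for solution 2.
type Draft4Encoder interface {
	JSONSchemaDraft4() map[string]interface{}
}

// SizedDict is a simplified version of the example type; the embedded
// schema.Dict is left out to keep the sketch runnable on its own.
type SizedDict struct {
	MaxValues, MinValues int
}

func (sd SizedDict) JSONSchemaDraft4() map[string]interface{} {
	m := map[string]interface{}{"type": "object"}
	if sd.MaxValues > 0 {
		m["maxProperties"] = sd.MaxValues
	}
	if sd.MinValues > 0 {
		m["minProperties"] = sd.MinValues
	}
	return m
}

// encode is a hypothetical helper showing what the jsonschema package
// would have to do: all comma and quote handling is left to encoding/json.
func encode(v interface{}) (string, error) {
	enc, ok := v.(Draft4Encoder)
	if !ok {
		return "", fmt.Errorf("jsonschema: %T does not implement Draft4Encoder", v)
	}
	b, err := json.Marshal(enc.JSONSchemaDraft4())
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := encode(SizedDict{MaxValues: 10, MinValues: 1})
	fmt.Println(s, err)
	// json.Marshal sorts map keys, so this prints:
	// {"maxProperties":10,"minProperties":1,"type":"object"} <nil>
}
```

One side effect worth noting is that encoding/json emits map keys in sorted order, so the output is deterministic but keyword order cannot be controlled by the validator.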

Advantages:

Disadvantages:

Alternative solutions

For reference, I have considered two other solutions as well.

Solution 3: Create encoder that handles commas, delimiters and whitespace.

I have some pseudocode for this. But although it's probably more fun to implement than solution 2, and probably higher performing, it also results in a more complex solution.

When writing a custom Validator, the interface to implement would be:

type Draft4Encoder interface {
        WriteJSONSchemaDraft4(enc *ObjectEncoder) error
}

I came up with the following public interface for streaming object and array encoders:

// ObjectEncoder allows streaming JSON key/values to an underlying io.Writer. Remember to close
// an object with Flush() once you are done streaming values. Sub-encoders can be initialized to
// stream nested objects or arrays.
type ObjectEncoder {
        // Put writes a key/value, as well as an opening brace `{\n` or comma `,\n`, based on
        // the current encoder state. If initialized via NestObject, the opening delimiter is
        // `"KEY": {`, where KEY is replaced by the value of the passed in key for the encoder.
        Put(key string, v interface{})
        // Flush closes an open object with `}\n` and returns true. A non-opened object will first be
        // opened, unless omitEmpty is true, in which case false is returned and nothing is written.
        Flush(omitEmpty bool) bool
        // NestObject returns an ObjectEncoder that can be used to stream a sub-value for key.
        // Once done streaming values, close the sub-encoder via Flush. If omitEmpty is specified
        // while flushing, the key/value pair is only written if at least one value was inserted.
        NestObject(key string) *ObjectEncoder
        // NestArray returns an ArrayEncoder that can be used to stream a sub-value for a key. ...
        NestArray(key string) *ArrayEncoder
}

// Commentary skipped.
type ArrayEncoder {
    Put(v interface{})
    Flush(omitEmpty bool) bool
    NestObject() *ObjectEncoder
    NestArray() *ArrayEncoder
}
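
To give a feel for the bookkeeping involved, here is a heavily reduced sketch of the object encoder: only Put and Flush, no nesting, and compact output instead of the newline-separated format described above. NewObjectEncoder, Err and the two demo helpers are assumptions added for the example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// ObjectEncoder streams key/value pairs of a single JSON object.
type ObjectEncoder struct {
	w     io.Writer
	count int   // number of key/value pairs written so far
	err   error // first error encountered, if any
}

func NewObjectEncoder(w io.Writer) *ObjectEncoder { return &ObjectEncoder{w: w} }

func (e *ObjectEncoder) write(s string) {
	if e.err == nil {
		_, e.err = io.WriteString(e.w, s)
	}
}

// Put writes "{" before the first pair and "," before every later pair.
func (e *ObjectEncoder) Put(key string, v interface{}) {
	if e.err != nil {
		return
	}
	k, err := json.Marshal(key)
	if err != nil {
		e.err = err
		return
	}
	val, err := json.Marshal(v)
	if err != nil {
		e.err = err
		return
	}
	if e.count == 0 {
		e.write("{")
	} else {
		e.write(",")
	}
	e.write(string(k) + ":" + string(val))
	e.count++
}

// Flush closes the object. An empty object is written as "{}" unless
// omitEmpty is true, in which case nothing is written and false is returned.
func (e *ObjectEncoder) Flush(omitEmpty bool) bool {
	if e.count == 0 {
		if omitEmpty {
			return false
		}
		e.write("{")
	}
	e.write("}")
	return true
}

// Err returns the first error that occurred while encoding.
func (e *ObjectEncoder) Err() error { return e.err }

// encodeString is a demo helper that streams a string schema.
func encodeString(minLen, maxLen int) (string, error) {
	var buf bytes.Buffer
	enc := NewObjectEncoder(&buf)
	enc.Put("type", "string")
	if minLen > 0 {
		enc.Put("minLength", minLen)
	}
	if maxLen > 0 {
		enc.Put("maxLength", maxLen)
	}
	enc.Flush(false)
	return buf.String(), enc.Err()
}

// emptyObject is a demo helper that flushes an encoder without any Put calls.
func emptyObject(omitEmpty bool) (string, bool) {
	var buf bytes.Buffer
	written := NewObjectEncoder(&buf).Flush(omitEmpty)
	return buf.String(), written
}

func main() {
	s, err := encodeString(33, 35)
	fmt.Println(s, err)
}
```

Even this reduced version needs state tracking and error plumbing; nesting and the array encoder would add more of the same.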

Advantages:

Disadvantages:

Solution 4: Define a (giant) struct with all valid keywords

We could define a struct, à la this one. However, the linked struct only holds a very small subset of the legal JSON Schema keywords, and allowing all of them would result in a pretty giant struct.
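
As a sketch of what such a struct might look like for just a handful of keywords (the Draft4 type name, the field selection and the helpers are assumptions, not the linked struct):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Draft4 holds a small, hypothetical subset of JSON Schema keywords.
// Numeric keywords are pointers so that an explicit 0 can be told apart
// from "not set" when combined with omitempty.
type Draft4 struct {
	Type          string        `json:"type,omitempty"`
	Format        string        `json:"format,omitempty"`
	Pattern       string        `json:"pattern,omitempty"`
	MinLength     *int          `json:"minLength,omitempty"`
	MaxLength     *int          `json:"maxLength,omitempty"`
	MinProperties *int          `json:"minProperties,omitempty"`
	MaxProperties *int          `json:"maxProperties,omitempty"`
	Enum          []interface{} `json:"enum,omitempty"`
}

// intPtr is a small helper for filling in the pointer fields.
func intPtr(i int) *int { return &i }

// render encodes the struct with encoding/json.
func render(s Draft4) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	s, err := render(Draft4{Type: "string", MinLength: intPtr(0), MaxLength: intPtr(35)})
	fmt.Println(s, err)
}
```

Fields are emitted in declaration order, and application-specific keywords would need either extra fields or a separate escape hatch, which is part of why the struct grows quickly.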

Advantages:

Disadvantages:

smyrman commented 7 years ago

As a sub-point to solution 2, it would actually be quite useful if the schema types themselves implemented the Draft4Encoder interface, as that allows users to reuse the built-in encoding more easily:

func (sd SizedDict) JSONSchemaDraft4() map[string]interface{} {
        m := sd.Dict.JSONSchemaDraft4()
        if sd.MaxValues > 0 {
                m["maxProperties"] = sd.MaxValues
        }
        if sd.MinValues > 0 {
                m["minProperties"] = sd.MinValues
        }
        return m
}

However, then it's probably still better to skip "Draft4" from the name, and let the end-users deal with any potential JSON Schema version updates. Alternatively, we need a function in the jsonschema package that can replace m := sd.Dict.JSONSchemaDraft4() and returns a map that can be extended.

Keep in mind that this example is constructed just to highlight potential usage.

@rs, @yanfali, any view on which solution(s) you prefer?

rs commented 7 years ago

I think the solution 1 is the most idiomatic. For most validators the code should be straightforward. The most complex validators are the one nesting others, and those would be already available.

smyrman commented 7 years ago

One big difference between solution 1 and the json.Marshaler interface, though, is that in solution 1 you are not expected to return an object with encapsulating braces {}, but instead just a partial JSON object that will get other fields merged in from elsewhere. Perhaps that is OK...

rs commented 7 years ago

Aren't most of the validators supposed to just return what serializeField would return?

smyrman commented 7 years ago

No, it should return what validatorToJSONSchema returns.

smyrman commented 7 years ago

E.g. "type": "string", "format": "custom", "minLength": 33, "maxLength": 35
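
A sketch of how such a fragment could be combined with other fields by the caller. CustomString, encodeField and the "description" merging are hypothetical; the real package merges fields in its own way.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// Marshaler is the writer-based interface from solution 1.
type Marshaler interface {
	MarshalJSONSchema(w io.Writer) error
}

// CustomString is a hypothetical validator that writes only a fragment:
// key/value pairs without the surrounding braces.
type CustomString struct{}

func (CustomString) MarshalJSONSchema(w io.Writer) error {
	_, err := io.WriteString(w, `"type": "string", "format": "custom", "minLength": 33, "maxLength": 35`)
	return err
}

// encodeField wraps the validator fragment in braces and merges in a
// description written by the caller.
func encodeField(description string, m Marshaler) (string, error) {
	var buf bytes.Buffer
	buf.WriteString("{")
	if description != "" {
		d, err := json.Marshal(description)
		if err != nil {
			return "", err
		}
		buf.WriteString(`"description": ` + string(d) + ", ")
	}
	if err := m.MarshalJSONSchema(&buf); err != nil {
		return "", err
	}
	buf.WriteString("}")
	// Guard against fragments that do not add up to a valid JSON object.
	if !json.Valid(buf.Bytes()) {
		return "", fmt.Errorf("jsonschema: validator produced invalid JSON")
	}
	return buf.String(), nil
}

func main() {
	s, err := encodeField("An ID", CustomString{})
	fmt.Println(s, err)
}
```

Note that a validator writing an empty fragment would leave a dangling comma after the description in this sketch, which is the kind of delimiter bookkeeping discussed earlier in the thread.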

rs commented 7 years ago

Oh yes you're right. I think it's fine.

smyrman commented 7 years ago

> Oh yes you're right. I think it's fine.

Alright, I was probably overthinking this one...

> The most complex validators are the one nesting others, and those would be already available.

You mean one validator nesting another validator and/or Schema, such as Array, Object, Dict etc.? There might be some situations where you would want a custom object implementation etc., but as long as we can call back into the API, I guess we are fine. Which brings me to the next point...

Looking at the extension bit from solution 1 (fixed typo sd.d):

func (sd SizedDict) MarshalJSONSchema(w io.Writer) error {
        enc := jsonschema.NewEncoder(w)
        if err := enc.Encode(sd.Dict); err != nil {
                return err
        }
        ...
}

We need to change func (e *Encoder) Encode(s *schema.Schema) error to accept interface{} to make it work. This is probably not a breaking change in most cases.
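
A sketch of what the widened signature could look like, using a type switch. Schema, Fragment, encodeSchema and encodeToString are local stand-ins (assumptions); only the shape of Encode is the point.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Schema is a local stand-in for schema.Schema (assumption).
type Schema struct{}

// Marshaler is the writer-based interface from solution 1.
type Marshaler interface {
	MarshalJSONSchema(w io.Writer) error
}

// Fragment is a hypothetical validator that writes itself verbatim.
type Fragment string

func (f Fragment) MarshalJSONSchema(w io.Writer) error {
	_, err := io.WriteString(w, string(f))
	return err
}

// Encoder mirrors the shape of the jsonschema encoder.
type Encoder struct{ w io.Writer }

func NewEncoder(w io.Writer) *Encoder { return &Encoder{w: w} }

// Encode accepts interface{} so that both schemas and validators can be
// passed in. Existing calls that pass a *Schema keep compiling.
func (e *Encoder) Encode(v interface{}) error {
	switch t := v.(type) {
	case *Schema:
		return e.encodeSchema(t)
	case Marshaler:
		return t.MarshalJSONSchema(e.w)
	default:
		return fmt.Errorf("jsonschema: unsupported type %T", v)
	}
}

// encodeSchema is a placeholder for the existing schema encoding logic.
func (e *Encoder) encodeSchema(s *Schema) error {
	_, err := io.WriteString(e.w, `{"type": "object"}`)
	return err
}

// encodeToString runs Encode against an in-memory buffer.
func encodeToString(v interface{}) (string, error) {
	var buf bytes.Buffer
	err := NewEncoder(&buf).Encode(v)
	return buf.String(), err
}

func main() {
	fmt.Println(encodeToString(&Schema{}))
	fmt.Println(encodeToString(Fragment(`"type": "string"`)))
}
```

The cases where this would break callers are those that depend on the exact method signature, for example assigning enc.Encode to a variable of type func(*schema.Schema) error or relying on Encoder to satisfy an interface that declares the old signature.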

Other than that, would it still make sense to try out e.g. solution 3 internally for #51?

rs commented 7 years ago

I think it's very complex. The more I think about it, the more I like the Marshaler with io.Writer. Maybe we should try a PR to see how it looks.

smyrman commented 7 years ago

Ok, let's reject solution 3 for now, also internally. It is complex.

I can do a PR for solution 1 with io.Writer when I get time. Unlike the other solutions, it doesn't require any big changes internally, so it should be straightforward.

As for solution 2 (to return map[string]interface{}), I just (re-)realized that Swagger 2.0 supports non-JSON encodings, such as YAML, as specified here. I don't think it's a very big point to support though. I could still do a partial implementation of solution 2, if I have time, just to see how it compares to the current internal solution in terms of complexity.

yanfali commented 7 years ago

I'm not a huge fan of YAML because it's whitespace-based and very hard to validate correctly. However, if we're just treating it as a target I guess that's not terrible. If I were to choose an actual format that wasn't JSON it would be TOML, since that can be validated sanely. JSON, for all its warts, can be mechanically checked for correctness by third-party tools.

smyrman commented 7 years ago

> I'm not a huge fan of YAML because it's white space based and very hard to validate correctly.

To be clear, I use YAML as an example as it's a supported format for JSON Schema serialization for the Swagger spec. Also, I am not suggesting adding any explicit support for other formats to the rest-layer jsonschema package.

The only support we might want to add in rest-layer itself is to return the map[string]interface{} value, giving end-users the opportunity to serialize JSON Schema into different data formats if they so wish. Very much similar to how rest-layer's ResponseSender interface works.

smyrman commented 7 years ago

This has been merged.