elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.71k stars 8.12k forks source link

[Mappings editor] Use Avj schema validator instead of io-ts #56272

Open sebelga opened 4 years ago

sebelga commented 4 years ago

Currently, we use the io-ts lib to validate the configuration of the mapping (the complete object without the properties and dynamic_templates parameters).

As I was working on the fix to support mappings type in index template I struggled to make the io-ts schema strict, not allowing unknown properties to be declared. I realized it was not possible and had to use a hack to make it work.

If we validate the following mappings

js
const mappings = {
  dynamic: true,
  date_detection: true,
  numeric_detection: true,
  dynamic_date_formats: ['abc'],
  _source: {
    enabled: true,
    includes: ['abc'],
    excludes: ['abc'],
    unknownParam: 'hello', // should fail as it's unknown
  },
  properties: { title: { type: 'text' } },
  dynamic_templates: [],
  unknown: 123, // should fail as it's unknown
};

To make it fail with io-ts we need to

// Declare a schema

const mappingsConfigurationSchema = t.exact(
  t.partial({
    dynamic: t.union([t.literal(true), t.literal(false), t.literal('strict')]),
    date_detection: t.boolean,
    numeric_detection: t.boolean,
    dynamic_date_formats: t.array(t.string),
    _source: t.exact(
      t.partial({
        enabled: t.boolean,
        includes: t.array(t.string),
        excludes: t.array(t.string),
      })
    ),
    _meta: t.UnknownRecord,
    _routing: t.interface({
      required: t.boolean,
    }),
  })
);

// Create an array for _each_ (!) object we don't want unknown properties declared

const mappingsConfigurationSchemaKeys = Object.keys(mappingsConfigurationSchema.type.props);
const sourceConfigurationSchemaKeys = Object.keys(
  mappingsConfigurationSchema.type.props._source.type.props
);

// Validate the schema
const result = mappingsConfigurationSchema.decode(mappingsConfiguration);
const isSchemavalid = isLeft(result) === false;

// Manually validate if there are unknown properties
const unknownConfigurationParameters = Object.keys(mappingsConfiguration).filter(
  key => mappingsConfigurationSchemaKeys.includes(key) === false
);

const unknownSourceConfigurationParameters =
  mappingsConfiguration._source !== undefined
    ? Object.keys(mappingsConfiguration._source).filter(
        key => sourceConfigurationSchemaKeys.includes(key) === false
      )
    : [];

// Check each unknown property array one by one
if (unknownConfigurationParameters.length > 0) {
  // .... some logic
}

if (unknownSourceConfigurationParameters.length > 0) {
  // ... some logic
}

If we compare with avj, this is how we would do it (if this looks like ES mappings declaration... it is pure coincidence ! πŸ˜„ )

import Avj from 'ajv';
const ajv = new Ajv();

// Declare a schema
const schema = {
  properties: {
    date_detection: { type: 'boolean' },
    numeric_detection: { type: 'boolean' },
    _source: {
      type: 'object',
      properties: {
        enabled: { type: 'boolean' },
        includes: { type: 'array', items: { type: 'string' } },
        excludes: { type: 'array', items: { type: 'string' } },
      },
      additionalProperties: false, // Don't allow unknown
    },
    // ...rest of schema
  },
  additionalProperties: false, // Don't allow unknown
};

// Validate
const valid = ajv.validate(schema, data);

// Grab a coffee as we are already done :)
if (!valid) {
  // ... some logic
}

avj goes much further when it comes to schema validation, I let you look at its features. The strength of io-ts lies in the fact that we can retrieve Typescript Types out of its schema, but in practice, we don't need that feature.

We do need a robust validation schema declaration (like Joi gave us) that runs in the browser and allows us to validate our JSON simply.

cc @cjcenizal @jloleysens

elasticmachine commented 4 years ago

Pinging @elastic/es-ui (Team:Elasticsearch UI)

jloleysens commented 4 years ago

@sebelga Thanks for creating this issue for discussion! I really appreciate it πŸ™‚. Some of my thoughts:

  1. I totally agree that io-ts is a more low-level library (helper libraries have been built on top of it like: https://github.com/gcanti/io-ts-types to provide more high level types). In that sense, it also doesn't ship with a ton of configurability. With io-ts one has atoms with no higher level settings like include/exclude types. Also I'm not a fan of their documentation.

  2. I actually think this may be a bug and have opened an issue against io-ts about this particular behaviour (allowing unknown properties in t.types or t.partial). io-ts claims to implement runtime types based on typescript. If const a: { b: number } = { b: 1, c: 2 } is invalid TS it should be invalid at runtime too. [UPDATE] It appears as though the current behaviour in io-ts is intentional since typescript implements structural type checks. There is an ongoing discussion at io-ts about how best to handle this.

  3. Are there any other behaviours we may want out of a schema validation library that you have identified? Besides disallowing unknown properties at runtime.

  4. What is the level of urgency here?

Overall I think we should go with the tool that doesn't incur too much tech debt in using it while not hampering productivity as you've illustrated. If ajv gives us 1, 2 and 3 I think we should add it as a Kibana dependency for sure and if 4 is high then sooner rather than later :).

[UPDATE]

One other thing to consider is that we do want to keep the list of dependencies as short/secure/clean as possible so auditing the dependencies of ajv would also be a good step to take!

sebelga commented 4 years ago

With io-ts one has atoms with no higher-level settings like include/exclude types. Also, I'm not a fan of their documentation.

Yes indeed io-ts is very low level and it seems that apart from defining types and interfaces it does not go further. If we ever need to validate that an array is not empty, a number is smaller than a threshold, a string does not start with a character, we would need to manually write those validations, which IMO should be the role of a validation lib.

About documentation: it is indeed very complex and poorly written for people not so familiar with advanced Typescript concepts and functional programming. We finally "get" it more by trial and error than understanding the docs.

Are there any other behaviors we may want out of a schema validation library

See my answer above but I could think of many. Like I said we not only need to validate that a value is a certain type, we should be able to validate that the value is also valid according to our business rules. My opinion is that we should have a tool that does not limit ourselves. Looking at AVJ doc it seems that string can be validated as ipv4, ipv6, uri, time ...

Also, I am proposing ajv, but I am opened to hear about other validation libraries.

we do want to keep the list of dependencies as short/secure/clean as possible

Great point, I also think we should keep the list of deps short. This is why the first thing I asked you when the Joi issue came up is: what is the recommendation from the Kibana op team? In the sense, what is the alternative (lib available in the repo) that we should be using from now on instead of Joi. I agree that, if we find ajv valuable for our use cases, that it works both in the browser and in the server, that the bundle size is correct, that it is fast, that it covers a wide range of validation scenarios, then we should advocate for it to the Kibana team and have all the teams use that library instead of each team finding the "best schema validation lib" 😊

To answer your question, it has 4 dependencies

What is the level of urgency here?

None as we have a fix (hack 😊 ) for the mappings editor validation in place. But as I said above, if we do find value in ajv and want to start using it, then we should try to make it the default (recommended) lib for schema validation in the Kibana repo.

rudolf commented 3 years ago

I think the closest we have to a default validation library is @kbn/config-schema. We are aware that it has several short-comings like no documentation, no browser support and bad performance.

There seems to be better typescript support coming for AJV https://github.com/ajv-validator/ajv/issues/736#issuecomment-692264999

One benefit of io-ts is that so many teams are already using it, so before we would make a recommendation for a new default we also need to weight up the effort to convert any existing validation vs the potential benefits.