hapijs / joi

The most powerful data validation library for JS
Other
20.89k stars 1.51k forks source link

RFC: Joi plugins/extensions/custom functions #577

Closed Marsup closed 8 years ago

Marsup commented 9 years ago

See https://gist.github.com/Marsup/14597d0c8eaa10c4addb for latest version of the RFC.

totherik commented 9 years ago

This looks like a good start, but one major issue is that it's using what is effectively a global in an unsafe way, potentially leading to naming collisions and unexpected behavior.

For example, imagine if module A and module B both depend on the same version of Joi and both declare a type Foo. The module will be deduped and they'll both reference the same instance of the Joi module, meaning they now compete for the same name, Foo, in the same namespace. This isn't a terribly unrealistic scenario and the collision would be very difficult to debug.

Marsup commented 9 years ago

As written, any attempt to override a type will result in an error, I don't think this should be a problem to debug.

totherik commented 9 years ago

Ah, ok, I missed that part.

If I depend on A and B and an error is thrown, I have no recourse or ability to remedy the situation as I don't own those dependencies and those authors could not have anticipated being I the same dependency graph as each other.

This would require module authors to define globally unique (as in unique from every other package in npm) types.

Also, as a module author I can say that having to come up with and define globally unique names via trial and error is undesirable .

Marsup commented 9 years ago

I'm not sure it's preferable to locally load all extensions. Uniqueness will be partially guaranteed by npm naming, for other cases I don't see how we can solve that.

totherik commented 9 years ago

Uniqueness will be partially guaranteed by npm naming

Can you clarify?

Also, I understand I'm the worst kind of commenter. I identify a problem but, unfortunately I don't have a solution. (Need to think on it more.) Sorry about that.

I can say that at my company where we would need to define types for validation in things like service wrappers, this mechanism would not work and we would not be able to use it due to the high likelihood of name reuse.

Marsup commented 9 years ago

If people publish types as modules named joi-type, that should avoid many conflicts, but that doesn't fix the internal ones. So you'd rather require all extensions each time in all your files?

AdriVanHoudt commented 9 years ago
  1. I really like the spec and how it opens up Joi for custom stuff
  2. Maybe add some docs for the handlebars in language where you reference parameters?
  3. Will Joi itself now use this system? Because you can basically rebuild Joi with this right?
Marsup commented 9 years ago

I don't believe the templating language is relevant for this proposal but yes it'll have to be documented. Joi need to maintain a bunch of core features because of parameters validation.

rlidwka commented 9 years ago
  1. Maybe rename Joi.addType to Joi.extend for consistency reasons?
  2. What the difference between convert function Joi.addType and validate function in Joi.<type>.extend? As far as I see, both can change value, and both can throw on error.
  3. I don't see a way to access arguments passed to a type. In your example, Joi.user(options), options seem to be inaccessible.
Marsup commented 9 years ago
  1. Yeah why not, based on the options given, it should work. The Joi object is an Any currently though, you'd be extending it, I'll have to see if it's feasible.
  2. The convert has a chance to do something before anything else does, this is a post-base check if you will. I've thought of this because of an issue about mongodb's ObjectId: if you need to coerce your value to a specific type (in this case new ObjectId(value)) for any other extension method to work, you have no place to put it.
  3. There are no options passed to a type, what do you mean ? The options used to build a type or a method are not the concern of the user, hence not exposed, should it be ?
myndzi commented 9 years ago

Really excited about this, and bonus: asynchronicity!

I think maybe you misunderstand the namespace problem brought up by totherik. He's not talking about uniqueness as an NPM module, he's talking about uniqueness in the Joi namespace. I agree, this is a problem. It can be solved fairly readily, however, by allowing Joi to be instantiated rather than using it as a singleton, e.g. var Joi = require('joi')(options?);

It would then be up to the developer to share the Joi instance around, but this is trivial in Node.js since you can just create a simple wrapper that also defines your extensions:

var Joi = require('joi')();
Joi.extend(etc...)
module.exports = Joi;

You can drop the requirement for the () by making it instead a method of Joi itself, like Joi.giveMeANewDefaultInstance() (can't think of a good name!)

Joi has long been my favored validation library, but I've wound up using it almost nowhere because of the two things this proposal will address. Happy happy joy joy!

myndzi commented 9 years ago

Re: the question about cloning/freezing or trusting the developer, I'm in favor of the latter. Solid enforcement appeals to me, but one of the most useful use cases for Joi is in things like HTTP requests, where the performance difference can be solidly beneficial. A suitable compromise is to allow the user to specify an option about whether to freeze the objects or not, and this option could be used when running tests or development, for example, but not in production.

totherik commented 9 years ago

Thanks for clarifying my comments @myndzi. (Weekend away with the fam so trying to stay offline :) )

So with the pursuit of types-as-npm-modules, is it reasonable to expect that a single application/module may need to depend on incompatible versions of the same npm module type definition? If so, npm can't be used to manage those dependencies. (I'm thinking major revisions of types that aren't required to be backward compatible according to semver.)

I don't know all use-cases, but I have to imagine this could be an issue.

Marsup commented 9 years ago

I understood what he meant, npm is indeed a "trick", not a solution, but it might avoid some basic conflicts. Having multiple versions of your models is indeed a problem (joi or not), but isn't it better to have an error thrown to tell you about this mistake rather than having 2 separate parts of your code supposedly manipulating the same thing and have different versions of it ? If it is an npm module, it's probably wiser to have a single version in your dependency tree, and if it's part of your code, maybe you'll have to make this explicit, like rename type user into v1user and create a v2user if you create a new version of your API. What do you think ?

I kind of agree on the freezing/cloning/vanilla problem, if you really want to mess up your schemas, that's your responsibility, can't protect users from everything.

myndzi commented 9 years ago

I don't see how the name of the NPM module affects the name of the thing extending Joi in any way, really. Validation is useful for lots of things, and the same name can mean different things in different contexts. Namespacing is a troublesome solution, but extending separate instances seems like a clean way around it...

I'm not sure you do understand it, since throwing an error is out of the question. It's not a mistake so much as a circumstance that can arise as a result of two separate modules using Joi for validation. It's not about describing a model and manipulating it from two angles, it's about the fact that node.js modules are inherently singletons, so you run into namespace problems when you use multiple distinct modules that utilize joi for validation...

myndzi commented 9 years ago

Let's take your mongo example as a base. It extends Joi with a type called 'user'. Say you add in something like Passport, which has also utilized Joi for validation and has a 'user' type of its own with a different meaning. You, as the developer, can do nothing about the fact that you're using two modules that want the 'user' namespace, but only one is allowed to have it, unless you do evil things with the path structure which leads into undefined behavior. It's not an error that the mongo module is defining 'user', and it's not an error that passport is defining 'user'. Throwing doesn't solve it, and the developer has no tools to solve it either. Allowing the modules to utilize their own individual instance of Joi, however, cleans this kind of scenario up nicely :)

AdriVanHoudt commented 9 years ago

@myndzi This can easily be solved by the authors of the modules by choosing more unique names like passport-user but it is indeed a concern that you may run into duplicates which leads to not being able to combine certain modules if the authors don't want to change their names

Marsup commented 9 years ago

I perfectly understood the problem thanks. I'm trying to find alternatives to that, because what you're proposing doesn't make sense to me. If you inspect your API, you will find several validations with the same type but not the same underlying concept, that bothers me.

I have not made this proposal just to offer my own version of custom functions, which would be much simpler if you look at previous PRs about it. Joi is also built for documentation through code, I intend to keep it that way.

Now we could introduce an optional namespace property that would force you to do Joi.myModule.myType but I'm not even sure a single level will please everyone. I can't think of any way to have what I want without some kind of uniqueness.

AdriVanHoudt commented 9 years ago

Any thoughts on alias support? Joi.alias(<type>, <otherType>) e.g. Joi.alias('valid', 'only')

Marsup commented 9 years ago

What do you think if addType just returned a new type and let you attach it to Joi if you really want it ? Like Joi.user = Joi.createType({ ... }) (createType makes more sense considering the behavior).

AdriVanHoudt commented 9 years ago

So a module would create a Joi type and export it, so you can add specific types to your custom Joi?

Something like this?

var mongoUser = require('joi-mongo-user');
var passportUser = require('joi-passport').user;
var Joi = require('joi');

// either this
Joi.user = mongoUser;
Joi.passportUser = passportUser;

// or this
Joi.user = passportUser;
Joi.mongoUser = mongoUser;

module.exports = Joi;

joi-mongo-user

module.exports = Joi.createType({...});

joi-passport

module.exports = {
    user: Joi.createType({...}),
    otherKey: Joi.createType({...})
}

I like it.

Although I still like the idea of Joi.object().isUser() since you can clearly see the base type.

Marsup commented 9 years ago

I've updated the proposal, see the diff in https://gist.github.com/Marsup/14597d0c8eaa10c4addb/revisions. Does it address your concerns @myndzi and @totherik ?

AdriVanHoudt commented 9 years ago

So Joi.extend() will not extend the singleton Joi but return a 'new version' of it right? Whereas the Joi.type.extend() can do both(taken from the examples)? As for the rest I like it!

nlf commented 9 years ago

:+1: I like the idea of returning a new singleton rather than mutating the core, makes things much more pluggable and keeps up with the concept of immutability within joi. Good idea!

AdriVanHoudt commented 9 years ago

I agree with @nlf but some examples look otherwise, that's why i mentioned it

myndzi commented 9 years ago

If I understand this correctly, it's similar to what I had in mind but more fine-grained. Where my suggestion would group all extended types (by a single module) under a single instance of Joi local to the module that required it, this proposal would instead create multiple individual instances of Joi, one for each type you create. This works acceptably, but doesn't make as much organizational sense to me. Am I missing something?

In other words, I am thinking of something like this:

Joi
- Joi-Passport
  - user
  - oauthprovider (or whatever)
- Joi-mongo
  - objectid

whereas my understanding of the new proposal is more like this:

Joi
- Joi-Passport-User
  - user
- Joi-Passport-Oauthprovider
  - oauthprovider (or whatever)
- Joi-mongo
  - objectid
Marsup commented 9 years ago

You can just as well organize your own composition of types :

var JoiPassportUser = require('joi-passport-user');
var JoiPassportOauthprovider = require('joi-passport-oauthprovider');
var JoiMongo = require('joi-mongo');

module.exports = Joi.extend([
  { name: 'passportUser', base: JoiPassportUser },
  { name: 'passportOauthProvider', base: JoiPassportOauthprovider },
  { name: 'mongo', base: JoiMongo }
]);

You can do that in a single pass like this (which will probably be more efficient), or with several extends.

Maybe what's missing is the ability to include a Joi instance into another one, but that might prove to be a challenge.

myndzi commented 9 years ago

var user = Joi.string().min(1).required().extend({ ... }) feels a little clunky, I think this is why:

There are two fundamental concepts at work here, a "type" and a "rule". While it makes sense to provide a way to avoid repeating work, it doesn't quite make sense to define a new type ("user") as an extension of an existing base type, or as the result of some specified rules. What you're wanting to do here, I think, is avoid repeating the work of validating the value as a string, but this feels more at home in the validation function itself, something like:

var user = Joi.createType({
    validate: Joi.string().min(1).required(), // a joi schema or a function?
    coerce: function (value) {
         // coercion failure is also a validation failure
        return Users.findOne({ id: value });
    }
});

Coercion applies to types but not rules, and rules are specific to a given type; if we are creating a validation for a User object, it is liable to be something that needs coercing from something like an ID, and also something for which the rules for 'string' no longer make any sense to apply to.

I am currently imagining something like this:

var hasRole = Joi.createRule({
    description: 'check if a user has a role',
    parameters: {
        role: Joi.string().required(),
        db: Joi.object().required()
    },
    language: {
        database: '!!failed to verify the user because of a database error: {{err}}',
        role: '!!user {{user}} does not match role {{role}}'
    },
    validate: function (user) {
        // db stuff, i don't know mongo well
    }
});

var user = Joi.createType({
    validate: Joi.string().min(1).required(), // a joi schema or a function?
    coerce: function (value) { return Users.findOne({ id: value }); },
    rules: {
        hasRole: hasRole // it gets named here, not by a 'name' key
    }
});

module.exports = Joi.extend({
    user: user
});

In createType, 'validate' would be a validation applied before coercion; it would have little meaning after coercion -- the purpose of coercion, here, is to put the value into a form the rules can be applied to, whereas the purpose of validation is to ensure that the value is acceptable for the base type to begin with. Putting 'rule's as an object in createType allows you to inherit rules if you should so desire, for example:

var SuperString = Joi.createType({
    validate: ...,
    coerce: ...,
    rules: Joi.string.extend({
        myCoolValidator: function () { ... }
    })
});

module.exports = Joi.extend({
    string: SuperString
});

The object syntax also supports 'extend' well both semantically and implementation-wise

myndzi commented 9 years ago

I think we're converging on the same ideas ;)

I honestly see very little use for exporting anything other than a whole-and-completely-usable Joi object, but reusability is always helpful. To me, if I need to perform some kind of custom validation, it's for the purpose of argument-checking a function I am also writing. I'm having a hard time coming up with a scenario where I'd want to export a Joi schema rather than the function it's intended to do work for. Thoughts?

AdriVanHoudt commented 9 years ago

I like the ideas in your proposal @myndzi The benefits for exporting only the rule/type is then you can make 1 place where you wire them all up without having to extend every previous Joi object from the previous module

myndzi commented 9 years ago

What I mean is that Joi isn't likely to be layered in that fashion. I can't think of a case where I'd have Joi layered three deep as you suggest. If I require a module that uses Joi, I'm not requiring it to gain access to new Joi validation types -- I'm requiring it to gain access to whatever that module is supposed to do. If it uses Joi to validate things, great! But those are things I am not responsible for validating, they are up to this module I'm using. So why would I want this module's copy of Joi at all, let alone want to extend it?

Marsup commented 9 years ago

Looks like we're almost back to my draft with base type and convert. :)

I don't really see the benefit of createType if extend does the same thing, it's just an unnecessary intermediate object while you could use the POJO with extend directly.

You seem to have multiple forms of extend (on Joi and on type), care to explain the difference ? If the 2nd form returns a type as I guess it is, I don't think it's gonna be easy to just put that into rules and merge it with the validate part. Your last example would be better written like this :

var SuperString = Joi.createType({
    validate: Joi.string(),
    coerce: ...,
    rules: {
        myCoolValidator: function () { ... }
    }
});

I'm trying to minimize the number of apis you'll have to use, this seems a bit more complicated than I'd want it to, but yes I think we're converging.

myndzi commented 9 years ago

Well, the point of having a separate function is mostly so you don't have to write one huge monster function with nested objects many ways deep. It also gives you points to break things up into files; if the actual product was as straightforward as I wrote, I probably wouldn't have broken it up either.

Extend in every case should be interpreted to mean "return a new object that's a copy of this object, and assign these keys to it". You're right, my example is poorly written; what it really was meant to represent was Joi.string.rules.extend(), as in, "give me a copy of this type's rules" -- but the "rules" component seemed superfluous. The main point was to separate the rules from the type -- what I'm interested is in the rules that can be applied to a string; what I'm not interested is making the underlying value into a string or requiring it to be. Clean separation is what I was going for.

I do prefer the original form of your proposal, felt it made a clearer distinction between a type and a rule, and that it's a worthwhile distinction. All it needed was the 'extend' idea, a way of instantiating a sub-copy of Joi instead of affecting the global singleton.

Edit: Re: 'validate' vs 'base type', I think that 'base type' implies inheritance in a place that inheritance doesn't belong, but I was trying to avoid the necessity of something like:

function (value) {
    var errors = Joi.validate(Joi.string().min(1), value);
    if (errors) { throw errors; }
    // other stuff
}

... so then I came back to the idea of supplying a Joi schema to validate against as an optional alternative. The equivalence totally flew right past me ;)

Coercion vs conversion is just another choice according to intent; coercion implies "same value but in another format" whereas conversion doesn't carry that connotation as strongly.

raisch commented 9 years ago

As an aside, I just published joi-extender as an experiment in adding new 'top-level' types to joi. While it's clearly marked as EXPERIMENTAL, I'm using it in several projects and only mention it here for purposes of discussion.

tsunammis commented 9 years ago

@Marsup Do you have a branche with your current WIP about this RFC ?

Marsup commented 9 years ago

@tsunammis nope, the RFC is not final yet, I won't start implementing before that, this is a big refactoring ahead and I don't want to go back and forth.

tsunammis commented 9 years ago

@Marsup indeed, you have right to fix the RFC before implementing it ;-)

fhemberger commented 9 years ago

Just a short remark/question, as I just stumbled upon this issue:

Joi operates on the existing JavaScript data types (string, object, array, boolean, etc). I guess for the majority of cases you don't want to add new data types but instead add further checking patterns/functions to the existing data types, hence createType is a bit misleading.

Hapi offers the server.decorate() method, so how about adapting this for Joi as well? For Example:

Joi.decorate('string', 'usZipCode', function (value) {
    // perform check
});

I think it would be more expressive that way.

AdriVanHoudt commented 9 years ago

@fhemberger how would you cover the use case of decorating something like string().min(0)?

fhemberger commented 9 years ago

@AdriVanHoudt Hmm, I don't know. I would only extend the data types to add new assertions, and chain them with existing methods, e.g.: Joi.string().length(5).usZipCode().

That makes the validation chain more explicit than adding an arbitrary number of custom functions, which could wrap an arbitrary number of other Joi functions each.

AdriVanHoudt commented 9 years ago

true but you want to prevent stuff like: usZipCode() requires a string with 5 chars so you add a test in the rule for that but when you do .length(5).usZipCode() you're checking twice because I don't know whether usZipCode() checks for the length or not

fhemberger commented 9 years ago

But the same is true for nested custom functions. You have to look inside as well to see what they check.

AdriVanHoudt commented 9 years ago

hmm true, a case can be made that the custom rule should be smart enough to check (in this example) that the string is 5 long. The question is then do you want to extend string() or string().length(5)? since the later is easier for e dev to make custom rules

myndzi commented 9 years ago

A zip code, while basically a string, is a different "kind" than a string. It's its own entity that has its own rules. Ideally, for me, you would make a US Zip Code "type". This type could be coerced from a string or number and would throw if the input value is not a valid US Zip Code (this includes length). Rules, distinct from types, are additional constraints beyond the "kind" of thing; for example, a Zip code type might have a rule that restricts it to be within a certain state or set of states.

The distinction between "type" and "rule" is important here; "types" come with inherent validations and "rules" apply additional constraints to reduce the value from "anything that's a valid type" to "only certain valid types"

The type itself would probably have to inherit from any(), since string or number based rules don't have meaning against a zip code.

AdriVanHoudt commented 9 years ago

The distinction is indeed important but I do not agree with you. A zipcode is a String so it should use String() as the base just like email() and ip() for example

myndzi commented 9 years ago

A zipcode might be stored as a string, but it is not a "string" in the sense of the rules that apply to strings. Consider the String class rules:

Insensitive: meaningless to a zip code Min: meaningless to a zip code Max: meaningless to a zip code Length: meaningless to a zip code Regex: nominally meaningless to a zip code Alphanum: meaningless to a zip code Token: meaningless to a zip code Hex: meaningless to a zip code Uppercase: meaningless to a zip code Lowercase: meaningless to a zip code Trim: meaningless to a zip code CreditCard, email, ip, url, guid, hostname: Meaningless to a zip code; should probably be their own types rather than rules against String

The only reason you would inherit the type from a String is to gain the above rules. Since none of them apply, it makes very little sense to inherit from String. In fact, if we were going by the current canon, a zipCode method would instead be a rule applied to String, not a type of its own; however, this would preclude the ability to target a zipCode type with specific rules if such was your desire, and I'd prefer instead to see the pseudo-types above become their own types instead: They also have potential for rules that target their types specifically:

CreditCard types could have a rule that matches based on provider types IP types could have a rule that matches based on IP class, or type (local/non-routable, reserved, etc.) Hostnames could have a rule that matches based on whether they are a root public suffix

etc...

P.S. - You can note that the same question that arose above about string length + zip code rules presently exists already in the case of the pseudo-types mentioned. You could apply .length(X).guid() or .alphanum().ip() and so on, or even create completely invalid validations such as .alphanum().hostname() - this serves only to illustrate that while your thinking matches the current approach, it is worth considering changing. Should not all rules on a given type be compatible with each other? (excluding deliberately obtuse usage of the rule arguments, such as .min(5).max(4))

AdriVanHoudt commented 9 years ago

The way Joi works is that the types are the base types of JavaScript. If you represent a zipcode in Js it's probably a string (or number here in Belgium). So it makes sense to say that a zipcode should be a string that represents a zipcode right? Saying that length and trim etc. are meaningless is not correct. For example a zipcode in Belgium is always 4 numbers and if you represent it with a string why not be sure and trim the string? If you want specific 'rules' on rules like ip you have the ability to pass options into the rule. For example you can specify if an ip should be ipv4 or ipv6

myndzi commented 9 years ago

Presumably, as part of being its own type, the zipcode would be coerced; this process would include trimming the string if desired. Our original example was a specific kind of zip code, for which the length was an inherent property; this is the context of my example. A US zip code is inherently 5 digits (leaving aside the extended zip code for the moment). The discussion was about whether this inherent property should be implicit or explicit.

Should you want to validate the length of a zip code, it also makes more sense that you would implicitly validate the length by explicitly specifying the locality of the zip code. That is, you wouldn't use ZipCode().length(4) but ZipCode().country('Belgium') or some such.

Your point about Joi's current base types being 1:1 with Javascript's is noted; I think that a more deliberate and useful organization is possible, which is why I commented about it.

AdriVanHoudt commented 9 years ago

Just thought of this. If you really want to you can just extent any() with usZipCode(). This allows you to do just Joi.usZipCode() and solves your problem.

myndzi commented 9 years ago

That is indeed what I suggested originally :)