RFC: Joi plugins/extensions/custom functions

Marsup commented 9 years ago

See https://gist.github.com/Marsup/14597d0c8eaa10c4addb for latest version of the RFC.

kfitzgerald commented 9 years ago

Greetings,

I've been creeping on this thread since February. Perhaps zip code is a controversial use case for this functionality since, on the surface, it's a relatively easy case to work around using regex. However, @myndzi hinted at the real value this functionality would provide: the context or implicit value needs to be validated. The string itself is useless unless the zip code is an actual registered zip code, or unless it is checked against the country to which the code must belong.

My particular use-case involves validating identifiers, specifically 12-byte base58 or base62 encoded strings with a 2-character prefix (think mongodb identifiers made smaller and less ugly). Sure, Joi.string().alphanum() works fine as a preliminary check if the string is potentially valid, but the real need is to actually strip the prefix, decode the identifier and check if the total byte count is actually valid. Since the various encodings could result in inconsistent string lengths, this would require a custom validation mechanism, applicable to this thread.

Base58-encoded identifier: PR2c4Exzy7X6cikEDmp   <-- this is what gets validated
ObjectId: 55054501fdce2c7e5f58e56f
Decoded integer: 26312596343935070124784543087

Whether the end result looks something like Joi.objectId().base58().prefix(2) or Joi.string().alphanum().objectId().base58().prefix(2) doesn't particularly matter to me.
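
A minimal sketch of the check described above (strip the prefix, decode, verify the byte count), written as plain JavaScript rather than as a Joi rule. The Bitcoin-style base58 alphabet is an assumption, since the thread does not say which alphabet is in use, and `decodeBase58`, `encodeBase58` and `isValidId` are hypothetical helper names.

```javascript
// Assumption: Bitcoin-style base58 alphabet; the thread does not specify one.
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';
const PREFIX_LENGTH = 2; // two-character prefix, as described above
const ID_BYTES = 12;     // same width as a MongoDB ObjectId

// Decode a base58 string into a BigInt, or return null on an invalid character.
function decodeBase58(str) {
    let value = 0n;
    for (const ch of str) {
        const digit = ALPHABET.indexOf(ch);
        if (digit === -1) {
            return null;
        }
        value = value * 58n + BigInt(digit);
    }
    return value;
}

// Encode a BigInt as base58 (used here only to build a sample identifier).
function encodeBase58(value) {
    let out = '';
    do {
        out = ALPHABET[Number(value % 58n)] + out;
        value /= 58n;
    } while (value > 0n);
    return out;
}

// Hypothetical helper: true when the payload decodes to a value that fits in ID_BYTES bytes.
function isValidId(id) {
    if (typeof id !== 'string' || id.length <= PREFIX_LENGTH) {
        return false;
    }
    const decoded = decodeBase58(id.slice(PREFIX_LENGTH));
    return decoded !== null && decoded < 2n ** BigInt(ID_BYTES * 8);
}

// Build a sample from the ObjectId quoted above and validate it.
const sample = 'PR' + encodeBase58(BigInt('0x55054501fdce2c7e5f58e56f'));
console.log(sample, isValidId(sample));
```

A length check on the string alone would not be equivalent, because the same byte width can encode to different string lengths; that is the part a regex cannot express.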

raisch commented 9 years ago

I agree with those who are calling for the addition of new "intrinsic" types to Joi.

The zip-code use case is a good one and while I understand the interest to keep a one-to-one correspondence between Joi and JavaScript intrinsics, I believe that, for Joi to fulfill its stated purpose (to be a useful data validation suite), it should provide an API flexible enough to both adequately and succinctly represent its users' needs.

Another example of a use case is validation of Designated Market Areas (DMAs), which are represented by strings holding integer values between 500 and 799. The list of valid DMAs is sparse, in that not all values within the range are legal.

While it is indeed possible to validate a DMA using joi.string().regex(), doing so requires a large and very specialized regex which, if not constructed properly, will fail to validate correctly.

I can envision many such use cases and so strongly believe that validation of such specialized forms should have specialized validators, if only to allow developers to write concise, understandable code.
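
As a rough illustration of what a specialized validator buys over a regex, a sparse code list can be checked with a set lookup. This is a sketch in plain JavaScript; `isValidDma` is a hypothetical helper and the codes in the set are placeholders, not an authoritative DMA list.

```javascript
// Placeholder codes for illustration only; not an authoritative DMA list.
const VALID_DMA_CODES = new Set(['501', '504', '506']);

// Hypothetical helper: DMAs arrive as strings, so only strings are accepted.
function isValidDma(value) {
    return typeof value === 'string' && VALID_DMA_CODES.has(value);
}

console.log(isValidDma('501')); // in the placeholder set
console.log(isValidDma('502')); // inside the numeric range but not in the set
```

The lookup stays readable however sparse the list becomes, whereas the equivalent regex has to enumerate the gaps.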

Rob Raisch, Internet Handyman

myndzi commented 9 years ago

A more general way to put it is this:

Input format -> representative format -> presentation format

I'm a firm subscriber to this flow for data; all calculation, storage, etc. should be done in some normalized, canonical representative format. There's probably a term for this approach but I'm unfamiliar with it. The core difference between a zip code as a type and a zip code as a rule is that one stores the meaningful information in its representative format and one stores it in its input format. (Nevermind that in this case they both are likely to be the same thing ;)

Particularly for validation, and especially in the case of web applications, where coercion is commonly needed when dealing with query string parameters and form submissions, it is both useful and important to convert the input data as soon as possible and keep it in its most useful format for as long as possible. You want to be applying rules to the normalized/representative form of some meaningful piece of data, not the transport/input form.
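
One way to picture the input -> representative -> presentation flow, using a US-style zip code as the example. This is only a sketch: `parseZip`, `hasPlus4` and `formatZip` are hypothetical names, and the five-digit-plus-optional-four format is an assumption made for illustration.

```javascript
// Input format -> representative format: parse the raw string once, up front.
function parseZip(input) {
    const match = /^\s*(\d{5})(?:-?(\d{4}))?\s*$/.exec(String(input));
    if (!match) {
        return null; // not a recognizable zip code
    }
    return { zip5: match[1], plus4: match[2] || null };
}

// Rules are applied to the representative form, not to the raw input.
function hasPlus4(zip) {
    return zip.plus4 !== null;
}

// Representative format -> presentation format.
function formatZip(zip) {
    return zip.plus4 ? `${zip.zip5}-${zip.plus4}` : zip.zip5;
}

const zip = parseZip(' 123456789 ');
console.log(zip, hasPlus4(zip), formatZip(zip));
```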

gergoerdosi commented 9 years ago

"A Promise: this is the only acceptable way to handle asynchronicity."

What's the reason for this? I'm just wondering because hapi modules (under the hapijs organization) don't use promises. Why won't callbacks be supported?

Marsup commented 9 years ago

I want to avoid parsing the function declaration.

gergoerdosi commented 9 years ago

What needs to be parsed?

Marsup commented 9 years ago

The fact that you need to be async.

Marsup commented 9 years ago

This got me thinking: since a promise cannot be aborted easily, I might have to know upfront that it's asynchronous, leaving me no choice but to have a callback. Well, that sucks...

myndzi commented 9 years ago

I don't quite follow. If you want to abort a promise, you can just throw an error? You still must return it, so all you really need to do is check whether the return value is an object with a function 'then' property. Still, I much favor support for callbacks -- and I write all my async code with promises -- simply because it's kind of the baseline requirement for Node code.

Declaring up front when you define it that your validation function is async isn't too onerous, and beats the alternatives...
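
The return-value check mentioned above (an object with a function 'then' property) could look roughly like this. `isThenable` is a hypothetical helper name, not part of Joi.

```javascript
// True when the value looks like a promise: an object (or function)
// carrying a callable `then` property.
function isThenable(value) {
    return value !== null &&
        (typeof value === 'object' || typeof value === 'function') &&
        typeof value.then === 'function';
}

console.log(isThenable(Promise.resolve(1)));
console.log(isThenable({ then: 'not a function' }));
```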

DavidTPate commented 9 years ago

@Marsup When would you need to abort a promise? Wouldn't it just be a rejection instead?

As for canceling promises, Bluebird provides a way to do it, but things that are truly async can't really be cancelled. This functionality is still in draft form for the Promises/A+ spec (there are a few different draft proposals right now).

Marsup commented 9 years ago

I'd need to abort if called in synchronous mode; otherwise it would cause unnecessary work. Aborting wouldn't even be enough unless the promise starts on next tick.

myndzi commented 9 years ago

I don't really think it's Joi's responsibility to ensure the user doesn't do something stupid like call an asynchronous function synchronously. Just let it fly. Possibly print a warning that a promise was returned from a validation function but validate was called synchronously.
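
A sketch of the "just let it fly, but warn" idea. `runSync` is a hypothetical wrapper, not a Joi function, and the injectable `warn` parameter is an assumption added so the behavior can be observed.

```javascript
// Hypothetical wrapper around a user-supplied validation function that is
// being called in synchronous mode.
function runSync(validator, value, warn = console.warn) {
    const result = validator(value);
    const looksLikePromise = result !== null &&
        typeof result === 'object' &&
        typeof result.then === 'function';
    if (looksLikePromise) {
        // Do not try to abort or await anything; only tell the user.
        warn('A validation function returned a promise, but validate was called synchronously.');
    }
    return result;
}
```

The work started by the asynchronous function still runs, which is the cost mentioned earlier in the thread; the warning only makes the mistake visible.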

DavidTPate commented 9 years ago

What about removing the ability to do validation in a strictly synchronous manner? If you wanted to support callbacks or promises for calls to Joi.validate() then when there is a callback you could use it, otherwise you could return a promise from Joi.validate() if there isn't a callback.

It's difficult to handle things that can be both async and sync in the same interface and I would expect it to cause additional need for help since calls to Joi.validate() could be made in both a sync and async manner.
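
The callback-or-promise interface suggested above might be sketched as follows. This is not Joi's `validate()`: the `schema` here is a hypothetical stand-in, a function that returns `{ error, value }`, so that the example stays self-contained.

```javascript
// Hypothetical dual-mode validate: uses the callback when one is given,
// otherwise returns a promise.
function validate(value, schema, callback) {
    const promise = Promise.resolve()
        .then(() => schema(value))
        .then((result) => {
            if (result.error) {
                throw result.error;
            }
            return result.value;
        });

    if (typeof callback !== 'function') {
        return promise;
    }
    promise.then((validated) => callback(null, validated), (err) => callback(err));
    return undefined;
}

// Stand-in schema for illustration only.
const isString = (v) => (typeof v === 'string'
    ? { error: null, value: v }
    : { error: new Error('must be a string'), value: v });

validate('abc', isString).then((v) => console.log('promise:', v));
validate(123, isString, (err) => console.log('callback:', err.message));
```

Both paths always complete asynchronously, so callers never see a mix of sync and async behavior from the same call.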

raisch commented 9 years ago

Agreed. It's not the library's responsibility to enforce/dictate caller behavior. Garbage in, garbage out.

Perhaps async could be supported using two new chainables:

joi.asPromise() - returns a promise that resolves with result.value or rejects with result.error.

joi.onComplete(cb) - calls cb(result.error, result.value) when the current validation is complete.

asPromise() can either appear as the final chainable or if it occurs (possibly multiple times) within a chain, sets a flag that is used to mutate the final result.

onComplete() can appear any number of times in a chain and is invoked in-situ allowing for multiple callbacks based on the currently validated result.

/rr

Rob Raisch, Internet Handyman

raisch commented 9 years ago

Actually and on further thought, I believe onComplete() makes sense as a chainable but asPromise should be an option to joi.validate()

/rr

Rob Raisch, Internet Handyman

Marsup commented 9 years ago

"What about removing the ability to do validation in a strictly synchronous manner?"

Never gonna happen.

gergoerdosi commented 9 years ago

@Marsup: When are you planning to start the work on this? So many people ask for this feature. I think we should start working on the implementation even if we don't get it right the first time. We can always make refinements in later versions. We need this feature too, so I'm happy to help.

Marsup commented 9 years ago

There are clear alternatives to the lack of this feature, at least in hapi, so I don't want to rush it. Sorry, but I'm really swamped at work at the moment and can only get so much done in my free time. I've tried to integrate the last few discussions into the spec but it's still an unfinished draft; if you have time you can help finalize it.

gergoerdosi commented 9 years ago

No problem, I can totally understand it. What is an alternative to an async validation on the payload for a given route? Server methods?

Marsup commented 9 years ago

All the validations support declaring a function. In the meantime I'd still use joi schemas to validate basic model features in that function, then do the async things.

Something like:

validate: {
  payload: function (value, options, next) {
    // Run the synchronous joi validation first.
    var r = Joi.validate(value, schema, options);
    if (r.error) { // Joi.validate returns { error, value }
      return next(r.error);
    }
    doSomethingAsync(r.value, next); // re-use the joi mutated r.value for type casts
  }
}

gergoerdosi commented 9 years ago

Got it, thanks! We can live with that for now.

Marsup commented 9 years ago

OK, it took some time but I've updated the RFC (see 1st post) with something that I think will be simpler and more powerful. Hopefully I got it right this time. Comments?

nlf commented 9 years ago

this RFC looks good to me, i feel like you've covered all the use cases i can think of. my only concern would be making sure the behavior of assert with async validators isn't ambiguous.

simon-p-r commented 9 years ago

Looks good. Can base be an object, or just a key and value?

Marsup commented 9 years ago

@nlf it's mentioned that any attempt to use async with sync functions (assert and validate without a callback) will throw.

@simon-p-r necessarily a joi object, any being the most naked base you can have. Why?

simon-p-r commented 9 years ago

I have polymorphic json schemas at present which use async routines as part of validation that I want to migrate to joi.

marshall007 commented 9 years ago

@Marsup this looks really great!

Just to clarify, is the order of operations base -> pre -> rules or pre -> base -> rules? Based on the user example, it appears to be the former. Will there be a way to run custom type conversions before the value is validated against the base schema?

Marsup commented 9 years ago

Your guess is correct. What would be the use case to go before the base ?

marshall007 commented 9 years ago

@Marsup if, for example, you wanted to implement a custom string to Date coercion, it would allow you to convert the string to a Date yourself before the base Joi.date() validator fails.

Marsup commented 9 years ago

Joi.date() should be fixed rather than adding a hook for that, unless you can think of other use cases.

marshall007 commented 9 years ago

The real use case I had in mind is a bit more involved than just a string conversion. I was hoping to convert objects that look like this: {'$reql_type$': 'TIME', epoch_time: 1376075362.662, timezone: '+00:00'} into Date objects and vice versa.

Marsup commented 9 years ago

So you basically need joi.date() functions on a joi.object() base ?

marshall007 commented 9 years ago

@Marsup pretty much, yea. The value I'm validating might already be a Date though, in which case we don't need to convert first.
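
For reference, the conversion being discussed could be sketched like this, independently of where it would hook into Joi. `toDate` and `fromDate` are hypothetical helper names; the object shape is the one quoted above. Note that a JavaScript Date stores only an instant, so this sketch ignores `timezone` when converting to a Date and takes it as a parameter when converting back.

```javascript
// Convert a { $reql_type$: 'TIME', ... } object into a Date.
// Values that are already Dates pass through untouched.
function toDate(value) {
    if (value instanceof Date) {
        return value;
    }
    if (value !== null && typeof value === 'object' &&
        value.$reql_type$ === 'TIME' && typeof value.epoch_time === 'number') {
        // epoch_time is in seconds; Date expects whole milliseconds.
        return new Date(Math.round(value.epoch_time * 1000));
    }
    return null; // not something this sketch knows how to convert
}

// Convert a Date back into the object form.
function fromDate(date, timezone = '+00:00') {
    return { $reql_type$: 'TIME', epoch_time: date.getTime() / 1000, timezone };
}

const raw = { $reql_type$: 'TIME', epoch_time: 1376075362.662, timezone: '+00:00' };
console.log(toDate(raw).toISOString());
```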

nlf commented 9 years ago

for that use case couldn't you just use the pre to convert the object to a date and call the standard date validator in your validate method? i don't think it's worth the added confusion of having two places to preprocess your input

Marsup commented 9 years ago

Then the base could be an alternative with 2 possibilities. I'll agree that complicates it a bit but it's the best way to auto-document it.

AdriVanHoudt commented 9 years ago

Any progress on this? I actually encountered a use case where this would be very handy for us

Marsup commented 9 years ago

Nope, still twisting my mind about that double inheritance, I don't see it working in any possible way.

AdriVanHoudt commented 9 years ago

double inheritance?

jfhbrook commented 8 years ago

A few things:

1) I need this and it really hurts that this hasn't gone anywhere.
2) I would need to define multiple custom types and the extend syntax as proposed doesn't seem to meet this use case.
3) Double inheritance?

Marsup commented 8 years ago

1) I know, me too, until some company hires me to do just that, I have a paying job and a life, so, sorry...
2) There can be multiple types obviously, that's the point
3) The case described up there, where a property can be either a date or an object representing a date

jfhbrook commented 8 years ago

Walmart doesn't sponsor dev anymore? Bummer.

jfhbrook commented 8 years ago

Also: What would specifying 2 new types in an extend call look like?

Marsup commented 8 years ago

You specify the type on each extend, so 2 calls, I know it's lacking complex examples but I thought this one was clear.

jfhbrook commented 8 years ago

So given that .extend returns a wholly new instance, that seems odd. Less odd, of course, if that modifies the singleton. edit: Joi.extend(myFirstType).extend(mySecondType) isn't the worst thing though.

Marsup commented 8 years ago

It's keeping the immutable philosophy of joi, that way you're guaranteed that your version of joi cannot suffer from any side-effect of a 3rd party library that would also be using joi.

jfhbrook commented 8 years ago

I'm with it in terms of returning a new instance, I think that's by far the safest way to go. It just seems weird to create 2 instances of it when you only need one.

Marsup commented 8 years ago

There would be a single instance if you called extend with an array, that's documented.
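
The immutable extend behavior discussed here can be illustrated with a toy object. This is not Joi's implementation: `createRoot` and the `{ name }` extension shape are hypothetical and exist only to show that `extend` returns a new instance and that an array produces a single one.

```javascript
// Toy stand-in for a validator root; not Joi's implementation.
function createRoot(types = {}) {
    return Object.freeze({
        types: Object.freeze({ ...types }),
        // Returns a new root; the root it is called on is never modified.
        extend(extensions) {
            const list = Array.isArray(extensions) ? extensions : [extensions];
            const next = { ...types };
            for (const extension of list) {
                next[extension.name] = extension;
            }
            return createRoot(next);
        }
    });
}

const base = createRoot();
const chained = base.extend({ name: 'zip' }).extend({ name: 'dma' }); // two new instances
const single = base.extend([{ name: 'zip' }, { name: 'dma' }]);       // one new instance

console.log(Object.keys(base.types), Object.keys(chained.types), Object.keys(single.types));
```

Because `base` is never touched, a third-party library extending its own copy cannot affect yours, which is the guarantee described above.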

amcdnl commented 8 years ago

@Marsup are there any updates on this?

Marsup commented 8 years ago

I have a local proof of concept with sync extensions but no unit tests. Async will take a lot more time I think, maybe I could beta release what I have already if there's interest.

amcdnl commented 8 years ago

Ya, would love to give it a spin (as would others, I'm sure).

hapijs/joi #577