tgriesser / checkit

simple, flexible validations for node and the browser
MIT License
223 stars 53 forks source link

Catching all errors in validators is dangerous #13

Open jclem opened 10 years ago

jclem commented 10 years ago

The following seems like a reasonable way to check for uniqueness of email for a given collection:

// 'unique:users:email'
Checkit.Validators.unique = function(val, collection, field) {
  collection = require('./collections/' + collection);
  var query  = {};
  query[field] = val;

  collection.fetch(query).then(function(resp) {
    if (resp.length > 0) {
      throw new Error('error message');
    }
  });
};

However, since Checkit considers any error type to be a failed validation, by the time I'm handling the validation failures for a model save, there's no way to differentiate between errors thrown by programmer error, database connection issues, or anything other than the "expected" failed validation.

In the example above, simply mis-typing the collection name ('unique:uers:email') might result in something like this...

model.save().then(function(model) {
  res.json(model);
}).caught(Checkit.Error, function(err) {
  res.statusCode = 422;
  res.json(err.toJSON());
});

...returning errors like { "email": ["Cannot find module './collection/uers'"] }. Worse if one accidentally does something like this...

process.env.MY_SECRET_SOMETHING.slce(1); // I meant to type "slice"!

...which might return { "email": ["Object my-secret-something-value has no method 'slce'"] }.

Unless I'm missing something, there's currently not a way around this. In light of this, I think it makes sense for Checkit to throw errors that are of anything other than a specific type (maybe Checkit.ValidationError) separately.

tgriesser commented 10 years ago

Hey @jclem, thanks for pointing this out.

The error handling semantics around the library is something I'm still a bit unsure about and looking to nail down before I try to publicize the library, just haven't had the time to go through it critically and ensure that all things make sense, so issues like this definitely help to nail down.

So, there is sort of a way around it (though a bit obscure)... If you wanted, you could change the toJSON implementation on the Checkit object, such that it filters out errors as you described (and re-throws as appropriate)... though you make a good point about the TypeError's and potentially exposing undesired information.

Perhaps TypeErrors or ReferenceErrors could be special cased and a generic message returned in their place rather than outputting the actual message from the error? I'm hesitant to require the use of a Checkit.ValidationError because it'd require you have Checkit around everywhere you'd create a custom validator.

Thoughts?

bendrucker commented 10 years ago

Another possibility would be to make use of the catch/error distinction in Bluebird. That adds another layer of complexity, but it's a hard problem. You have to hide error messages from unexpected exceptions but also provide the user with a way of passing expected exceptions for serialization.

I like the way Hapi deals with this in its request handlers. It runs the handlers inside a domain. Handler functions have the signature request, reply, where reply accepts a single result argument. result is either data or an error (instanceof Error). When reply is called with an error, this is considered an expected error and it's sent as the response. When something inside the handler throws an exception rather than passing it to reply, it gets sent as a generic 500 error. Domains work really well for this kind of situation and promises can be a pain.

One possibility would be to use a boolean property on the error called isCheckit. The lib would take care of intercepting all errors, re-throwing if !isCheckit, but re-casting where isCheckit && !(err instanceof Checkit.ValidationError. Checkit.ValidationError.prototype.isCheckit = true for consistency.

I know this sounds kind of kludgey but I've seen the is{Lib} pattern used before, especially since instanceof can create unexpected reference bugs if there's ever multiple versions.

jclem commented 10 years ago

I don't see requiring Checkit everywhere I'm building validators as an issue, given that I'm building validators for use by Checkit, anyway.

The TypeError/ReferenceError suggestion doesn't feel right to me, because they don't represent a state of invalidation, but rather that programmer error has occurred or something else has gone wrong in the environment. Really, this is the case with any error other than one I intentionally throw in order to trigger invalidation. I should be forced to handle these separately, and in fact I'd probably want them thrown immediately (rather than once every other promise has finished).

While @bendrucker's isCheckit suggestion feels a little wonky for some reason, I can see it as a viable alternative for people who really, really don't want to require Checkit when they're writing validators for Checkit (although I'm not sure when this would be the case).

Thanks so much for taking the time to discuss this, by the way.

bendrucker commented 10 years ago

Agreed that using a boolean property to match feels weird but it ends up being the cleanest solution in most cases, especially if Bookshelf eventually depends on Checkit for built in validation. If you then use Checkit independently, you might end up with two copies of the library. All of a sudden err instanceof Checkit.ValidationError might be false in certain cases when you expect it to be true.

This is another idea I'm borrowing from the Walmart Labs guys btw (spumko/hapi#971)

tgriesser commented 10 years ago

Thanks so much for taking the time to discuss this, by the way.

Sure thing.

Might need to think on this one a bit. The issue is that as promises act as async try / catch, needing to special case types of errors would also mean that you can't trust rejected promises as sources of validation failure either. For example lets say someone does:

// Ensure the email address exists.
new Checkit({
  email: ['email', function(val) {
     return new User({email: val}).fetch({require: true});
  }]
}).run()...

This returns a rejected promise from Bookshelf which is just an Error (yeah, I know we need to subclass error types there @bendrucker, promise I'll get to that once I wrap up this whole knex overhaul).

So by this logic you'd need to recast it to a Checkit.ValidationError just to be able to handle it, which seems a little cumbersome.

So while I see the reasoning that exposing sensitive material could be an issue, in practice I feel like folks will catch these issues before going to production.

I should be forced to handle these separately, and in fact I'd probably want them thrown immediately (rather than once every other promise has finished).

I think I'd be alright with looking for and throwing any of the built-in error types which aren't just Error - (ReferenceError, SyntaxError, TypeError, RangeError, etc).

I don't believe that would help with your first case ("Cannot find module './collection/uers'") but I don't know what the real solution here is without requiring every single error (even those from promises) be flagged somehow as an "error I want Checkit to display"... and yeah as Ben mentioned, hard instanceof checks can sometimes end up being quite misleading.

jclem commented 10 years ago

I didn't realize when I opened this issue that as long as I know I'm always throwing Checkit.ValidationError in my custom validators, I can do something like this:

user.save().then(function(user) {
  console.log('it worked');
}).caught(expectedErrors, function(err) {
  console.log('only caught Checkit.ValidationErrors');
}).caught(function(err) {
  console.log('caught something unexpected');
});

function expectedErrors(checkitError) {
  if (!checkitError.errors) {
    return false
  }

  var onlyExpected = true;

  Object.keys(checkitError.errors).forEach(function(key) {
    checkitError.errors[key].errors.forEach(function(error) {
      if (!(error instanceof Checkit.ValidationError)) {
        onlyExpected = false;
      }
    });
  });

  return onlyExpected;
}

When I opened this issue, I actually didn't realize Checkit.ValidationError was something thrown by Checkit, I kinda just made it up and happened to be right. As long as Checkit is consistent in that, and I can be, too, it's not a big deal.

bendrucker commented 10 years ago

Yup, check out the Bluebird API docs on catch: https://github.com/petkaantonov/bluebird/blob/master/API.md#catchfunction-errorclassfunction-predicate-function-handler---promise.

If you pass a function as the first argument to catch, it does two things:

1) Checks whether err instanceof fn 2) Evaluates whether fn(err) is truthy or falsy

You can also make expectedErrors more concise by using the functional methods on the error. See https://github.com/tgriesser/checkit/blob/master/checkit.js#L414

Subash commented 10 years ago

I would prefer to throw Checkit.ValidationError instead of plain error.


// 'unique:users:email'
Checkit.Validators.test = function(val, collection, field) {
     throw new Checkit.ValidationError('error message');
};
vimto commented 10 years ago

Hey @tgriesser, I just bumped up against this issue and wondered if you'd progressed your thinking on a way to handle this?

For the moment, I always throw Checkit.ValidationError in custom validations, and use Bluebird's catch predicate to ensure all other exceptions bubble up and don't leak out as validation errors.

It's working fine, but adds a little ceremony to an otherwise nice and clean API.

How are you handling it when using Checkit?

jclem commented 9 years ago

Just hit this again, wondering if there's a better way to do this yet @tgriesser. Looks like a function to check each individual error in a Checkit.Error instance is still the safe bet.

federicobond commented 9 years ago

I think the cleanest way to handle this is to always throw Checkit.ValidationError, then making sure that other errors are not collected inside Checkit and instead bubble up through the promise.

I wrote a small patch to support this behavior. Let me know and I can open a pull request.