jagi / meteor-astronomy

Model layer for Meteor
https://atmospherejs.com/jagi/astronomy
MIT License
604 stars 67 forks source link

Validating before saving #48

Closed steph643 closed 9 years ago

steph643 commented 9 years ago

I am trying to see if packages astronomy+validator can replace aldeed::Collection2 in my project. One thing I don't see clearly is how to validate any object that I intend to save to the database. By validate I mean "throwing an exception if the object is ill-formed". What are the best practices? Should I implement my own validateAndSave() function? Is there something I am missing?

lukejagodzinski commented 9 years ago

Hmm I don't understand your question. What do you mean by saying "any object". I assume that you mean astronomy object, right? Have you read wiki about performing validation?

The validate() method returns true / false value which can be used to throw exception if you want. However you don't need to throw exceptions. You can use these functions to get error message(s) and display them in the way you want.

Please, describe your case in more details.

steph643 commented 9 years ago

The feature I am talking about is:

A Meteor package that allows you to attach a schema to a Mongo.Collection. Automatically validates against that schema when inserting and updating from client or server code.

(full doc here, this is one of the most used Meteor package).

I know about astronomy validators. And I can imagine how to implement a somewhat similar feature. But as I think this is a very common need, I was wondering why this is not already part of the package.

If you left it to programmers on purpose, I will gladly implement my own validateAndSave() function. But I was so surprised it is not already part of the package, that I wondered if maybe there was a deeper reason, a design pattern that would make this feature unnecessary...

lukejagodzinski commented 9 years ago

There were few people that already wanted this feature. I will add this to the TODO list. I was wondering why people ever need this feature, and I only found one reason: modification of multiple documents at once. I will probably have to write big module for this. I haven't investigate this problem in details yet, so I can't give you any advice.

What is your use case? For what do you need this feature?

steph643 commented 9 years ago

My use case is input validation. In the following example, I want to ensure that "title" is no more than 100 characters long:

Meteor.methods({
  setPostTitle: function(postId, title) {
    check(postId, String);
    check(title, String);
    var user = Meteor.user();
    if (! user)
      throw new Error('user not logged-in');
    var post = Posts.findOne(postId);
    if (! post)
      throw new Error('post not found');
    post.checkRoleCanChangeTitle(user.role);
    post.set('title', title);
    post.validateAndSave();
  }
});

In fact, I would like _any_ call to save() to go through validation, so that I don't risk saving invalid data (here an invalid title) to the database.

lukejagodzinski commented 9 years ago

Oh I see, so the only thing you have to do is something like this:

post.set('title', title);
if (post.validate('title')) {
  post.save();
} else {
  post.throwValidationException();
}

That's it. It's documented in the validation wiki and here is an example app https://github.com/jagi/meteor-astronomy-examples

I hope that it solves your problem.

steph643 commented 9 years ago

Yes, that is what I thought, thanks.

However this lets me with an uncomfortable feeling. What if I forget to call validate()? Is there a use case where I need validators but don't want to call validate() before saving? If there is none, why does the save() function doesn't call validate() directly?

Obviously, there are things I still need to grasp about your package. I am probably in the wrong mindset, due to the fact that I am using aldeed:simple-schema, aldeed:collection2 and dburles:collection-helpers. If I decide to use your package, I will have to remove those three (it would be meaningless otherwise). So I am probably looking at your package from a biased standpoint.

jthomaschewski commented 9 years ago

You could just do the validation in a global hook.

e.g

Astro.eventManager.on('beforesave', function(e) {
  if (!e.target.validate()) {
      e.target.throwValidationException();
  }
});

I personally like the fact that validation is separated from the basic model layer method "save".

I'm just a bit curious if validation is intended also as a complete replacement for check()'s or if it should only be used in addition to regular check()'s in meteor methods

lukejagodzinski commented 9 years ago

Not forcing validation on every save was a conscious choice. There are situations (when being on the server) when you don't want to execute the whole validation process, which in some cases can take some time. For example you can take some data from the other server and you are sure that it's valid and you just save it.

However as @jbbr mentioned you can always use global hooks.

I think that the need to remember to validate a document before saving it, isn't reason to make it a part of the save process. There are always possibilities to improve code, so I will work on feature that will call validation on direct calls to collection.

steph643 commented 9 years ago

On one side, I like the separation between validation and save. But on the other side, I need to "protect" my database the way aldeed:collection2 does. And it is meaningless to have 2 schemas in my code... So +1 for a feature for validating any collection update.

But in my opinion, save() should use this feature by default (with a validate boolean argument that user could set to false if needed).

lukejagodzinski commented 9 years ago

I will think about the API. Right now I'm adding it to the TODO list with the high priority

lukejagodzinski commented 9 years ago

There is a problem with validation on insert and updates. It breaks my assumption that developer wants the whole object to be valid. Let's say, we insert a document with following schema:

User = Astro.Class({
  name: 'User',
  collection: Users,
  fields: {
    'firstName': {
      type: 'String',
      validators: [
        Validators.required(),
        Validators.string(),
        Validators.minLength(3)
      ]
    },
    'lastName': {
      type: 'String',
      validators: [
        Validators.required(),
        Validators.string(),
        Validators.minLength(3)
      ]
    },
    'email': {
      type: 'String',
      validators: [
        Validators.required(),
        Validators.string(),
        Validators.email(3),
        Validators.unique(),
      ]
    },
    'age': {
      type: 'Number',
      validators: [
        Validators.required(),
        Validators.number(),
        Validators.gte(18),
        Validators.lte(100)
      ]
    }
  },
  behaviors: ['timestamp']
});

However, we are only passing the firstName field.

Users.validateAndInsert({
  firstName: 'John'
});

Should it be valid? In my opinion, it shouldn't. It becomes even more complicated on updates. Let's say that we have validator that says that the lastName field has to be longer than the firstName field. And we want to execute following code:

Users.validateAndUpdate({
  _id: 'SOME_ID',
}, {
  $set: {
    lastName: 'Smith'
  }
});

In that case, we should first load the entire document into memory because we have a validator that depends on the other field. Loading a document before updating could be wrong idea. It makes me think that the entire idea of collection validation is not good.

steph643 commented 9 years ago

That is precisely where collection2 falls short: it tries to ensure database integrity (see here),something which cannot be done correctly without deep database system integration.

Let's discuss this more :-)

I don't understand your first example. I didn't know validateAndInsert could take an object as argument. To me, validateAndInsert takes no argument or a field name. Are you talking about validating an incomplete object? The "incomplete object" issue is not specific to database write. For example, what happens if you call validate() on this?:

var user = Users.findOne(userId, { fields: { firstName: 1 } });

About your second example, I didn't know you could have a validator which depends on other fields. How do I do that? collection2 does that in a horrible manner, because it does not distinguish between a field validator (which is local only) and an object validator (which can manage relationships between fields).

lukejagodzinski commented 9 years ago

Notice that in the first example I wrote Users.validateAndInsert() not User.validateAndInsert(). I called it on the collection. It's the feature that I'm working on. And yes I was thinking about inserting incomplete object.

If it goes about your example:

var user = Users.findOne(userId, { fields: { firstName: 1 } });
user.validate();

It will not pass validation because all fields are required. Right now to pass validation you would need to create quite complex validation rule:

validators: Validators.or([
  Validators.and([
    Validators.required(),
    Validators.string(),
    Validators.minLength(3)
  ]),
  Validators.null()
])

I think I have to introduce the required attribute in the field's definition which will act similarly to NOT_NULL option in SQL. On the other hand, right now you can validate document in following ways:

doc.validate(); // Validates entire document
doc.validate('firstName'); // Validates single field
``

and I could introduce one more syntax:

```js
doc.validate(['firstName', 'lastName']);

to validate multiple fields but not all.

If it goes about validation that depends on another fields, here is how you do it:

validators: Validators.gt(function() {
  return this.otherField; // The current's field value has to be greater than the "otherField" field's value.
})

validators: Validators.equalTo('otherField') // The current's field value has to be eqaul to the "otherField" field's value.

There are possible other combinations but I still have to improve it a little bit. I want it to work with all validators. Right now it's possible only with a few. It will work as showed above. When you pass function you can just compare current value with anything you want in the context of the given object.

steph643 commented 9 years ago

If it goes about validation that depends on another fields, here is how you do it:

IMO you have made the same mistake than collection2: you don't distinguish between a field validator (which is local only) and an object validator (which can take into account relationships between fields). If the programmer has to wonder whether he has to create a field1 validator defined as > field2 or a field2 validator defined as <= field1, then I think you got a oo design issue (i.e. your validator concept does not belong to the right class).

Maybe there should be two functions: validateEachField and validateObject. When inserting or updating the database, only validateEachField would be called. It would be up to the programmer to call validateObject if he wants to and if he got the complete object.

lukejagodzinski commented 9 years ago

I was following the concept from the Doctrine library (for PHP Symfony framework) however I've modified it a little bit. There was a concept of the Post validator where I can validate one field in the relation to another one but still using the same validation functions. Maybe I should split it into two groups in Astronomy. I have to rethink it.

doc.validateFields();
// Validates fields only that are present
doc.ValidateDocument();
// Validates entire document, so it needs to fetch entire document from the database
doc.validate();
// Runs validateFields() and validateDocument() one after another (to not break the api).

I probably shouldn't change a lot the way how validators are defined:

fields: {
  fieldName: {
    type: 'String',
    default: '',
    validators: [ // Field validator
      Validators.string(),
      Validators.minLength(3)
    ]
  },
  numberA: 'Number',
  numberB: {
    type: 'Number',
    default: 0,
    validators: [ // Field validator
      // It's not correct here to pass a function as an argument of the validator
      /*
      Validators.gt(function() {
        return this.numberA;
      })
      */
    ]
  }
},
validators: [ // Document validator
  Validators.gt(function() {
    return this.numberA; // It's correct.
  })
]

For now it's the only solution that comes to my mind that would not mess to much with the current approach.

steph643 commented 9 years ago

The proposed way to separate field and document validators looks very good to me.

Are you sure validateDocument() should fetch the whole document from the database? Why not keep the current behavior? People who want to validate the whole object can fetch it before...

serkandurusoy commented 9 years ago

I think a doc.ValidateDocument() makes more sense when doing inserts, but when doing updates, doc.ValidateFields() would make more sense. Perhaps it would even better if the fields to be validated are either implied (by looking at which properties of doc have changed) or optionally be passed as an array of fieldnames so that we validate only the fields that we want.

But there still are cases where validation depends on other fields that are not being updated. Let' say a field is only required if another field's value is a predefined value and is being updated without the dependency field. There may be even more complicated dependant validation scenarios therefore it is really hard to get this thing all right.

lukejagodzinski commented 9 years ago

@serkandurusoy

  1. Yes, it will work more or less the way you said. On inserts it will always validate entire document. On updates you will have choice, to validate: single field, multiple fields, all fields or entire document. It's almost implemented in coming update.
  2. Yes, validation of fields that depend on values of other fields is tricky. It can be solved thanks to getters in ES6 (no support for IE under 9 or 10 - I don't remember). Let me explain. Let's say that you have validator:
{
  firstName: Validators.maxLength(function() {
    return this.lastName - 1;
  }, 'First name has to be shorter than last name')
}

Now, when we get to the line this.lastName it calls getter function which checks whether given field had been loaded from the database. If not it tries to load entire document. Of course there are more problems like how to load entire document that have not been published to the client. It would work only on the server etc.

In my opinion, when someone is trying to validate entire document he/she should be aware of that it's needed to load it from the database.

Right now I'm preparing big update and release of v.0.13.0 will be a little bit delayed. There are going to be some minor changes in the API but after this release, the code base will be much simpler.

serkandurusoy commented 9 years ago

I'm not familiar with ES6 getters, so I cannot comment on that, but I'm glad to hear that you are heading in such direction.

About the changes, I'll be creating a new behavior that adds extra fields to the schema, should I wait for 0.13.0 or go ahead?

lukejagodzinski commented 9 years ago

I think you can go ahead. There won't be many changes in behaviors and even if there will be any it won't be hard to migrate to new version.

serkandurusoy commented 9 years ago

I'll be creating a behaviour for collations so that a field can be set to sort naturally, as in it gets sorted in that language. For example, an English sort would be a,b,c,o,u,v,w,x,y,z,ç,ö,ü but collated in Turkish, it would become a,b,c,ç,o,ö,u,ü,v,y,z,w,x.

Would you be interested in getting that into the official set of behaviours or should I go ahead and keep it private (or perhaps as a separate atmosphere package).

lukejagodzinski commented 9 years ago

Yes, it would be great to make it official module. As far as I know MongoDB does not support collation, or it already does? How do you want to achieve it?

serkandurusoy commented 9 years ago

It is a very simple idea; I use the collator class of the iLib i18n library to create sortKey values which are hex representations of the sort order of any given string in any given collation. I then binary encode and store it on mongodb along with the name of the collation.

I'm currently using that in a project with collection-hooks so that I can intercept beforeinsert/update to modify the doc. I also intercept beforefind/findone to detect the selector and check if it contains a collated field and modify the selector accordingly if it does. Therefore it is completely transparent to the user.

The only problem with porting that to astronomy would be the lack of beforefind/one intercepts. Do you think it would be possible for you to implement that as an event?

lukejagodzinski commented 9 years ago

Oh I see. I will investigate the iLib library.

Right now there is a QueryBuilder module for Astronomy, but it does not implement events yet. However, I will be working on the collection level validation and implementing events by the way will be easy. It will be available on v0.13.0 release.

So, maybe it will be better to wait for that version before creating this behavior :).

serkandurusoy commented 9 years ago

Sounds interesting. I haven't looked into query builder since its wiki page is basically empty https://github.com/jagi/meteor-astronomy/wiki/Query-Builder

OK then, I'll wait untill v0.13.0 gets released. In the meantime, I'll see if I can implement this on a project I'm kicking off right now. It would be a good prep to a package.

lukejagodzinski commented 9 years ago

Guys, I would like you to take a look at my last comment in this issue. I want to know your opinion.