neumino / thinky

JavaScript ORM for RethinkDB
http://justonepixel.com/thinky/

thinky modifies data used for model operations #321

Open mbroadst opened 9 years ago

mbroadst commented 9 years ago

In this example:

var userList = [
  { username: 'arthur', email: 'arthur@gmail.com' },
  { username: 'james', email: 'james@gmail.com' },
  { username: 'henry', email: 'henry@gmail.com' },
  { username: 'william', email: 'william@gmail.com' },
  { username: 'edward', email: 'edward@gmail.com' },
  { username: 'arthur', email: 'aaaaarthur@gmail.com' }
];

return User.save(userList);

Thinky does not clone the incoming data, and therefore operates directly on these documents. This should be a pretty easy fix; however, when I tried to use util.deepCopy in Model.prototype.save it did not seem fully up to the task. In Sequelize we just use lodash's _.clone, but I imagine you're not too keen on adding a dependency on lodash.

While this might not seem like a huge issue, it can create subtle and hard-to-trace errors. I'm currently writing a REST wrapper around thinky and have run into myriad problems with this as part of my unit testing.
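To make the symptom concrete, here is a rough sketch of what I observe after the call above (the exact fields depend on the model, but the point is that the caller's plain objects end up modified):

User.save(userList).then(function() {
  // The entries in userList are no longer the plain object literals we built:
  // thinky has written generated fields back onto them (e.g. the primary key)
  // and they are now Document instances rather than plain objects.
  console.log(userList[0].id);                                          // a generated id, not undefined
  console.log(Object.getPrototypeOf(userList[0]) === Object.prototype); // false
});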

neumino commented 9 years ago

I think this is working as intended, but maybe for the wrong reasons.

Typically, if you call new Model(value), we won't make a deep copy of value. This is because deep copying a large document can be expensive (more on that below).

In the end we kept the same behavior for Model.save:

var value = {...};
var doc = new Model(value);
// doc === value // true
doc.save().then(function(result) {
  // doc === result === value
});

// Other script
var values = [ { ... } ];
Model.save(values).then(function(result) {
  // result[0] === values[0]
});

We could add an option to change the behavior though. I can see how it can lead to tricky issues like you said.
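Purely as a strawman (this option does not exist today, and the name is made up), it could look something like:

// Hypothetical option, not part of thinky's current API: opt in to copying
// the input so the caller's objects are left untouched.
Model.save(values, {copy: true}).then(function(result) {
  // result[0] !== values[0]
});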

mbroadst commented 9 years ago

@neumino yeah, I can definitely _.cloneDeep() the data before passing it in, but it feels like a code smell to me that this module's default behavior has such an unexpected side effect.

IMHO the default behavior should be to clone the incoming data, with an option to "do it the tricky way for the sake of saving a few KB of memory" :smile: I can't imagine Node is that slow at iterating over the documents; if this really is a bottleneck in your code, then by all means take the fast route, right?
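For now I'm working around it on the caller side, roughly like this (assuming lodash is available):

var _ = require('lodash');

// Hand thinky a deep copy so the original userList stays plain and untouched.
return User.save(_.cloneDeep(userList));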

neumino commented 9 years ago

@mbroadst -- If you create a document with many keys (let's say a thousand, though I'm pretty sure a hundred is enough), V8 switches the object to dictionary mode (a hash table) and iterating over it is really slow. You can try timing JSON.stringify(<large object>) and you should be able to see that.
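A quick way to see it, with nothing beyond plain Node:

// Build an object with enough keys that V8 falls back to dictionary mode,
// then time a full traversal of it via JSON.stringify.
var obj = {};
for (var i = 0; i < 100000; i++) {
  obj['key' + i] = i;
}

console.time('stringify');
JSON.stringify(obj);
console.timeEnd('stringify');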

That being said, I'm on board, since it can get tricky. Let's add an option :)

marshall007 commented 9 years ago

I'm with @mbroadst in thinking we should clone by default. @neumino, from your example, it makes no intuitive sense that doc === value. Even if I assumed new Model(value) would mutate value, I definitely wouldn't expect its prototype to have been changed.

Currently, you do a deep copy only if doc instanceof Document. I think this is actually the one case where we could get away without deep copying, but we should deep copy by default in all other cases.
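In other words, something along these lines (a sketch of the proposed behavior, not thinky's actual code; deepCopy stands in for whatever clone helper ends up being used):

function prepareValue(value) {
  if (value instanceof Document) {
    // Already a Document managed by the model: safe to reuse as-is.
    return value;
  }
  // Plain object from the caller: deep copy by default so we never
  // mutate the caller's data.
  return deepCopy(value);
}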