pouchdb / upsert

PouchDB plugin for upsert() and putIfNotExists() functions
Apache License 2.0
149 stars 25 forks source link

diff functions can mutate local vars #7

Closed nolanlawson closed 8 years ago

nolanlawson commented 9 years ago

As pointed out by @slang800, this test will currently fail.

    it('should not mutate local vars', function () {
      var newDoc = {_id: 'foo', value: 5};

      var doUpsert = function () {
        return db.upsert('foo', function (doc) {
          if (doc.value !== newDoc.value) {
            return newDoc;
          } else {
            return false;
          }
        });
      };

      return doUpsert().then(function () {
        newDoc.value = 6;
        return doUpsert();
      }).then(function () {
        should.not.exist(newDoc._rev);
      });
    });
nolanlawson commented 9 years ago

To be honest, I'm not sure if the cure (deep cloning) isn't worse than the disease (users may be surprised by local vars being changed). How serious was this in your codebase @slang800 ?

notslang commented 9 years ago

It was something that contributed to a much messier error that resulted in the diff function getting called repeatedly, forever. But for that to happen again, the user would also need to be able to mutate the _rev field and have it passed to tryAndPut without checking.

So the error that I had is gone, but there's still a potential for mutation of documents returned by the diff function to cause confusing errors (like an _rev/_id property showing up in a document where it didn't exist before).

tyler-johnson commented 9 years ago

Is this still an issue?

Seems to me the solution is to shallow clone the document that comes out of the diff function. Upsert is only modifying _rev and _id properties. Everything else is up to the diff function, which is in control of the user. If they want to modify deep properties, they should deep clone the object first.

nolanlawson commented 9 years ago

I've said it elsewhere, but I'm really reluctant to add any cloning here. When profiling PouchDB, I constantly see that cloning/extending is a big source of performance problems. I would much rather users just promise not to do weird mutate-y things with these functions rather than clone defensively all over the place, especially given that pouchdb-upsert is frequently called within PouchDB itself.

tyler-johnson commented 9 years ago

I would much rather users just promise not to do weird mutate-y things with these functions rather than clone defensively all over the place

While I agree with this sentiment, I think this makes a big assumption about developers. Most are going to do exactly what you don't want them to do. The biggest problem is that functionality like this becomes hard for the programmer to reason about, since a lot is happening they cannot see. The truth is that I would rather have a predictable, but slow program over an unpredictable, fast one.

Perhaps we could add a small if statement that checks if the doc returned is not the doc passed before cloning? This way we can skip the cloning IO if it was original document getting manipulated by the diff function.

if (doc !== newDoc) newDoc = clone(newDoc);

Otherwise, a line and an example should be added to the documentation so people don't keep making this mistake.

nolanlawson commented 9 years ago

I would much prefer to just document it. In PouchDB there are lots of places where we clone in order to protect developers from themselves (e.g. bulkDocs()), but this is a plugin, so I feel comfortable setting some hard rules about how people ought to use it.

notslang commented 9 years ago

I feel like this article is relevant here: http://blog.codinghorror.com/falling-into-the-pit-of-success/

We shouldn't assume that people will read the docs... we should either deal with mutations, or at least throw an error when someone does something the wrong way.

nolanlawson commented 9 years ago

Yeah, I normally feel like APIs should be as "it just works" as possible. Just in this case I don't want to add a clone().

Another issue is that this module is depended upon by PouchDB itself while simultaneously being a plugin. Maybe I should break it up into two versions, and for the version used by PouchDB, I can remove the clone.

notslang commented 9 years ago

The 2nd option (throwing an error when someone does something the wrong way) is still possible if you make the object immutable with something like Object.freeze().

nolanlawson commented 9 years ago

Once https://github.com/pouchdb/pouchdb/issues/4144 is fixed, I'll be happy to add defensive cloning to this plugin. I just didn't want the performance drag inside of PouchDB itself.

nolanlawson commented 8 years ago

If anyone cares to write a pull request for this, this is up for grabs now.

olso commented 8 years ago

@slang800 is infinite looping described in https://github.com/pouchdb/upsert/issues/7#issuecomment-76011108 related to this issue or is there another issue that links to this behaviour

@nolanlawson @tyler-johnson

nolanlawson commented 8 years ago

Closing because I've decided defensive cloning is really not worth it. It's not clear what kind of cloning is best (how to deal with Blobs/Buffers in _attachments?), and the perf cost is a big penalty for a minority of users who will run into issues. If anybody would like to defend themselves against mutations, they are free to fork this project.