chbrown / rfc6902

Complete implementation of RFC6902 in TypeScript
https://chbrown.github.io/rfc6902/
322 stars 39 forks source link

The copy operation should be a deep-copy #38

Closed enobufs closed 6 years ago

enobufs commented 6 years ago

Creating a new issue to continue the discussion (#37) ; "whether copy object by reference or by value"...

Your last comment:

Your second point, about copying by value rather than reference, is a much deeper change. 
I should certainly be more systematic in the documentation about which functions treat data
as mutable, or operate on references rather than values. In general, I agree that a deep copy
would be preferable. Practically speaking, though, this library tends toward mutable over 
immutable, and reference rather than value; going immutable would require some pretty
major API changes, and might (?) incur some heavy performance penalties (or at least
require depending on Immutable.js or similar).

I don't really think it would be perceived as a major API change, but I agree, it requires a careful thought.

When copying an object, I think majority of people would expect the object is "replicated". Copying by reference is rather a "special case", IMO.

If you think about "ownership" of the object to be copied:

In "JSON" (as in object notation in a text), there no such thing as "reference". One object (notated) in JSON, always has one owner, or a parent, except the root - no parent. (called tree; acyclic connected graph) This ownership problem arises ONLY in the Javascript land. RFC 6902 is about JSON. That's why I think "copy-by-value" (keeping the "one owner principle", or I should say, the way JSON can express) is more general (natural), and copy by reference would be a "special case", if you still need to consider it.

So, I would advocate that the copy operation be done by value (deep-copy, to be more specific). We could offer an option to do copy-by-reference, but I wouldn't use it because it would be very error-prone as you can imagine.

enobufs commented 6 years ago

Let me give you one example.

Say, I have this object:

{
    current: { timestamp: 23, throttle: 0.57, ...(many other properties)... },
    history: []
};

And, I'd like to keep a snapshot of the property, current into history array, using RFC 6902 patches applying occasionally to the target object like this:

let patches = [
    // Take snapshot into history
    { op: 'copy', from: '/current', path: '/history/-' },
    // Update timestamp
    { op: 'replace', path: '/current/timestamp', value: 24 },
    // Take snapshot into history
    { op: 'copy', from: '/current', path: '/history/-' },
    // Update timestamp
    { op: 'replace', path: '/current/timestamp', value: 25 }
];

Apply those patches at the remote object.

rfc6902.applyPatch(o,  patches);

Expected (deep copy):

{
    current: { timestamp: 25, throttle: 0.57 },
    history: [
        { timestamp: 23, throttle: 0.57 },
        { timestamp: 24, throttle: 0.57 }
    ]
}

Actual (by current implementation, "copy-by-reference"):

{
    current: { timestamp: 25, throttle: 0.57 },
    history: [
        { timestamp: 25, throttle: 0.57 },
        { timestamp: 25, throttle: 0.57 }
    ]
}

The timestamps in the history array are all modified unintentially because the copy was done by-reference with the current implementation.

Above is, not only hard to realize how it went wrong, but also workarounds could be ugly. We would have to make a deep-copy of the current property outside rfc6902 module, then use add operation to add it to the history list, which would defeat the purpose of using copy-by-reference for performance reason thought initially.

chbrown commented 6 years ago

Thanks for the new issue and further discussion. Immutability certainly improves referential transparency, and is generally more in line with the "principle of least astonishment." It'd be nice if there were some lightweight library of copy-on-write data structures that implemented the same API as the built-in Array and Object (as a hashmap data structure), but Immutable.js weighs in at 16KB (minified+gzipped), which risibly dwarfs rfc6902's 3KB (minified+gzipped).

On the other hand, I feel that JavaScript's overwhelmingly mutable data structures, along with its pandora's box of other foibles, sets a pretty low baseline for astonishment. And rfc6902's existing API doesn't buck that trend, i.e., it doesn't try to smooth over JavaScript's warts.

But, you make a good case for deep copying when applying the copy operation — I agree the extra work is warranted. I've not done the work yet, but I started looking around; the following is mostly a note-to-self for when I get back to this.

Top deep-copying libraries:

enobufs commented 6 years ago

FYI: https://jsperf.com/deep-copy-vs-json-stringify-json-parse/103