Closed larrifax closed 2 years ago
Right, createPatch
expects POJO arguments, and doesn't gracefully handle fancy objects that are more than the sum of their parts.
As far as rfc6902
is concerned, Date
is a special case of the more general category of things that have typeof x == 'object'
but cannot be iterated and rebuilt by key.
Maybe Date
is a common enough type to merit special handling, but I'd prefer a more general solution that sends all fancy objects directly to diffValues()
. I could do this with a brute force input = JSON.parse(JSON.stringify(input))
but that would incur a lot of (usually) unnecessary processing overhead.
I'm thinking something like
const isPOJO = Object.getPrototypeOf(input) == Object.prototype
or
const isPOJO = input.constructor == Object
which appears to work for Date vs. literal objects, but I would need to do more testing on the generality / backward compatibility of this approach.
Thanks for looking into this. I agree that converting the value to & from JSON incurs too much extra processing overhead in the normal case. However, the two other examples you give would prevent key-diffing of ES6 classes (and probably ES6 classes transpiled to ES5), which we don't always want (?)
class Test {}
const isPOJO = Object.getPrototypeOf(new Test()) == Object.prototype // false
I imagine the exceptions to the typeof x == 'object'
-rule could differ from project to project. Could some kind of "fancy object" whitelist be an option?
I think the option should be "best effort with user classes" which is what it currently naively does, since it assumes the user isn't going to send in any instances of custom classes, vs. "consider all user classes completely opaque." The latter is more sane, because the following makes sense but is probably not what the user intended:
const {createPatch, applyPatch} = require('../rfc6902');
class Dog {
constructor(breed) { this.breed = breed; }
bark() { console.log('woof!'); }
}
class Car {
constructor(make, model) { this.make = make; this.model = model; }
honk() { console.log('beeeep!'); }
}
var dog = new Dog('dachshund');
var car = new Car('bmw', 'm3');
var dog2car = createPatch(dog, car);
console.log('dog -> car:', dog2car);
var cardog = dog;
applyPatch(cardog, dog2car);
console.log('cardog:', cardog);
// cardog Dog { make: 'bmw', model: 'm3' }
cardog.bark();
// woof!
cardog.honk();
// TypeError: cardog.honk is not a function
I don't want this library to get into the business of extending RFC6902 to support any enhanced (non-JSON) JavaScript types, but it should be easier for the user to drop in their own diffObjects (or diffArrays, etc.) functions without having to fork or otherwise rewrite large portions of the library.
So my plan is to:
createPatch
that percolates to diffAny
(and probably all the diff*
functions that are recursive) that, when set, specifies the current key-diffing behavior on user class instances, but by default considers them opaque, so that you wouldn't have hit your original surprising behavior on Date
.createPatch
creator function so that individual diff functions can be rewritten easily based on users' project requirements.Your plan sounds good. I like the way the library currently handles custom classes, but I do agree that it would be safer to consider them opaque. I'm happy as long as I'm able to hook into the createPatch-function in order to enable key-diffing on custom classes while simultaneously keeping the opaque diff on e.g. Date objects.
Hi, any progress with that issue? I came across same problem yesterday and I'm looking for any solution that wouldn't require rewritting much of my code.
JSON.parse solves such specificity with a reviver
function parameter which is often used to convert JSON dates into Date objects. You need such a hook for any specific key diffing.
This issue has been open for more than 3 years. It would be nice to solve it quickly as date usage is very frequent. Date object is a native javascript object, that could be treated specifically as it is so often used. No need to complicate solving of this issue. It will be time later to have a generic solution for all particular uses; but date does not deserve such a complication.
I'm almost sympathetic to adding a special case for Date
instances... except that dates don't survive a roundtrip though JSON :(
var now = new Date()
now == now
//⇒ true
JSON.parse(JSON.stringify(now)) == now
//⇒ false
A special case at the top of diffObjects
wouldn't feel like too much of a hack, but emitting operations with a value that doesn't survive JSON-serialization still feels icky, even 3 years later.
I've just added some documentation to the README about customizing the diff function via createPatch
's optional diff
argument (new as of v2.4.0
, released on 2018-06-25, partly with this issue in mind).
To give a concrete example for Date
instances specifically:
const customDiff: VoidableDiff = (input: any, output: any, ptr: Pointer) => {
if (input instanceof Date && output instanceof Date && input.valueOf() != output.valueOf()) {
return [{op: 'replace', path: ptr.toString(), value: output}]
}
}
const input = {date: new Date(0)}
const output = {date: new Date(1)}
createPatch(input, output, customDiff)
//⇒ [ { op: 'replace', path: '/date', value: new Date(1) } ] // as expected 👍
That's not the solution this thread's commentators are asking for :/, but I'm not convinced that adopting non-JSON
izable values into this library's default behavior (i.e., without requiring the user to do a bit more work to bring them in) is worth the potential headaches.
I'd love to hear any feedback on whether I'm being paranoid re: headaches, etc.
I'm also leaving this open since it is clear to me that compare(new Date(0), new Date(1))
returning true
isn't ideal. I'll have to think about what to do in that case... maybe add an unsafe
argument to createPatch
that defaults to false
and throw an exception on encountering "objects that are more than the sum of their [properties]" (somehow!?) unless unsafe==true
.
Obviously this is a challenging issue to solve cleanly, and being able to supply a custom diff on the createPath method (as you show above) solves root-level date properties, but it doesn't accommodate the case where perhaps you might have an array of objects on the root object that have a date property.
RootObject --> id, other root properties (name, description, whatever) --> Collection of ChildObject ------> ChildObject (having string, string and date properties)
As you articulate, the compare() is where the two different dates are seen as equal. So what we're seeing is that when one just wants to change a date value on an item in a root object's array it's not seen as a change. compare() is invoked for each of the child object properties and it doesn't pick up the date change.
I'm trying to think about the best way to work around this concern, so if anyone has come across this concern and has a nice workaround I am very curious to hear it.
Editing to add - curious if you had considered allowing createPatch to supply an override to compare() rather than supplying a voidableDiff? I'm probably missing something.
@Blackbaud-ChristopherHlusak you're totally right 🤦♂ and even more succinctly:
createPatch([new Date(0)], [new Date(1)])
[] /* <- wrong! 😞 */
I gloat about my fancy Levenshtein array diffing in the README, but that's how I've shot myself in the foot here, because compare
is naive to the relevant customization implemented in the diff
function passed to createPatch
, and thus behaves incorrectly. If diffArrays
didn't try to be so smart and was more like diffObjects
— only doing brute-force pairwise comparisons — we wouldn't be having this problem.
I'll have to think about this a bit before I decide on a solution, but my intuition is that the compare*
family is the problem, and that it should delegate all logic to the diff*
family. And if after doing that there is still any need for a general purpose compare(a: any, b: any) => boolean
function, it could be implemented as diff(a, b).length === 0
(i.e., the fully customizable diff
returns no patches). My main concern is that this might kill diffArray
's performance, or end up doing a bunch of duplicate work, but I agree that the current behavior is incorrect and I'll figure out how to fix it.
It's definitely been a crazy two months since our last exchange, but I wanted to ask if you have had an opportunity to look into this issue any further? It's conceivable we might dig into it ourselves a little bit, too - I'm sure it seems like such a small thing at this time.
Nice library though, it's generally worked quite well outside of this small issue!
Hi @Blackbaud-ChristopherHlusak unfortunately I got sidetracked 😬 but thanks for your reminder! Revisiting this yesterday/today I implemented the solution sketched out in my previous comment, which at least fixes the breaking test added in 88fae08 (and it would be trivial to add a conditional branch like if (object instanceof Date) { return 'date' }
to diff.objectType
which would handle the case in https://github.com/chbrown/rfc6902/issues/15#issuecomment-593519853), but the concern I mentioned remains.
I'm not sure how detrimental it would be IRL, but I can imagine pathological cases where having some large objects/arrays nested in other objects/arrays could dramatically tank performance compared to current (master
branch) behavior. Ideally, diffArrays
and diffObjects
would be iterators, and the new !diff(a, b).length
could short-circuit on the first result, but this could be tricky to implement inside the Levenshtein logic, not to mention the iffy cross-platform compatibility of JavaScript generator functions.
For that reason, I'd probably consider the potential performance regression caused by the replace-compare-with-diff
branch a breaking change, and bump the major version if I released it. Same if I resolved the performance regression issue by incorporating something relatively fancy like iterators / generators.
I'm open to suggestions, but at the moment my inclination is to meet somewhere in the middle and move the equal.{compareArrays,compareObjects}
functions parallel to each of the diff.{diffArrays,diffObjects}
functions and pass in the custom diff
argument just like with the diff*
functions.
Sorry for the delay, I just noticed your reply - I can carve out some time in my next sprint or the one afterwards to test & examine - you obviously have a more solid handle on this than anyone, but I'm definitely intent on helping examine & verify your solution.
For what it's worth, the object sizes we've been dealing with I can't imagine any sort of a performance concern whatsoever... I haven't specifically tested how long it takes to calculate patch operations for our objects but I always just dismissed it as an in-memory sort of calculation that would be negligible (your concern suggests maybe I should pay more attention to the speed).
I'm adding a reminder for myself to follow-up next sprint and I truly appreciate you taking the time to look into the issue.
Hello! I am attempting to npm install that branch and it looks like I'm getting an error w/ util.ts that is preventing the installation. This file hasn't changed w/ that branch; I have not previously done an npm install to a specific branch so it's possible I'm not doing it correctly.
npm ERR! prepareGitDep util.ts(20,27): error TS2339: Property 'length' does not exist on type 'never'.
I'm looking further to understand, but thought I would mention it along with just mentioning I'm looking to verify that the changes work across the various patch use cases we have.
I'm attempting the following: npm install --save https://github.com/chbrown/rfc6902#replace-compare-with-diff --verbose
I made a fork and just worked around the typescript compilation issue (which wouldn't be a problem as you normally release), leveraged that fork, and it looks like your fix works for all of the current scenarios we use in the application. For what it's worth, this looks good to me so far.
Finally (!) getting back to this and hopefully (😅 ) finishing it out...
replace-compare-with-diff
branch, I split out the non-potentially-regressive (= uncontroversial) changes and pushed those to master
: https://github.com/chbrown/rfc6902/compare/208d10f...v3.1.0typescript
dependency not being pinned and upgrading from 3.8.*
to 3.9.*
— not a big deal; fixed in c200e45 by a simple type assertion.replace-compare-with-diff
as rebased onto v3.1.1
, which clearly depicted the trade-off between the the simple solution of replace-compare-with-diff
and the potential performance regression described in https://github.com/chbrown/rfc6902/issues/15#issuecomment-626090712. Observing that compare
was only used in two places: (1) inside diff.diffArrays
(to test whether a pair of elements in the array are equal) and (2) inside patch.test
, and facing the prospect of basically reimplementing the compare{,Arrays,Objects}
family as lazy_diff{Any,Arrays,Objects}
(or w/e naming scheme) companions to the existing diff{Any,Arrays,Objects}
family, I decided it wasn't the worth retaining the additional complexity, especially since it'd be nearly identical code. But the performance concern remains, and I deleted the entire equal
module, which IMO constitutes a major version bump.tl;dr: I chopped up the changes FKA the replace-compare-with-diff
branch into some commits leading up to v3.1.0
(released—after a few other tiny fixes—as v3.1.1
), and a culminating commit fixing the issue, but carrying the potential performance-regression-inducing change, resulting in v4.0.0
(released just now).
Appreciate the effort, we'll incorporate the latest into our solution during our next sprint. Thanks much.
1+ year out and no complaints so I assume this is resolved.
I'm trying to call
createPatch()
on dates (or objects containing dates), but the diffing algorithm doesn't seem to detect differences between Date-objects.I guess the problem lies here, as
typeof new Date() === "object"
. Would it be possible to add a check for Date-objects and route them to thediffValues()
-method?