Open onmyway133 opened 6 years ago
Re benchmark: it looks flawed to me because you generate 1 array with 10000 identical elements and another one with 2 kinds of elements with 9900 and 500 copies respectively. It should probably be something more distributed with more kinds. You can use random alphabet letters for instance and run it a few times so that they are distributed.
There are over 13,000,000 elements in the traces array. Removing "bad" traces early is a badly needed optimization here.
Btw it's interesting that the simple algorithm (that you wrote) performs the best. I don't have time to profile it but some profiling would be interesting. I guess you use far fewer allocations and maybe avoid copies? Asymptotically looking this library should be faster but the devil lies in details.
ListDiff
doesn't generate a minimal diff so it's kind of a bonus if we look at it strictly (not comparable in the benchmark). It should be the fastest (linear IIRC).
That said I hope that people don't use hogs but I'm disappointed that this library couldn't handle mere 20000 elements.
EDIT: I misread. You're using Heckel, great!
@wokalski Thanks for the reply. About DeepDiff, I just followed this note https://gist.github.com/ndarville/3166060. It is based on 2 assumptions about line occurrences, so I have a table of entries to keep track of if a line appears once in each collection.
@onmyway133 I know this report is a little long in the tooth, but I tried to repo it and could not with the following code:
func testBadPatch3() {
let old = Array(repeating: UUID().uuidString, count: 10000)
var new = old
new.removeSubrange(100..<200)
new.insert(
contentsOf: Array(repeating: UUID().uuidString, count: 500),
at: new.endIndex.advanced(by: -20)
)
let p = patch(from: old, to: new)
let result = old.apply(p)
XCTAssertEqual(result, new)
}
Any ideas? Do you think if I run it enough times I will hit it?
@rayfix Hi, I'm have no ideas since it's long time ago 😅 , but those are the code I run when I did the benchmarking. And did you run tests on a device?
Hi @tonyarnold , I'm playing with different diffing algorithms, and want to give them some benchmarks https://github.com/onmyway133/DeepDiff#among-different-frameworks. When comparing
Differ
on some data sets, I get a crash. I run on device iPhone 6, iOS 11, Xcode 9.2. Maybe the datset is wrong, but it runs fine with many other diffing framworksHere is how such a data set gets generated