eXist-db / exist

eXist Native XML Database and Application Platform
https://exist-db.org
GNU Lesser General Public License v2.1
429 stars 179 forks source link

Strange performance difference between xmldiff:compare() and deep-equal() #4050

Open xatapult opened 3 years ago

xatapult commented 3 years ago

Describe the bug We have tried to do XML compares using xmldiff:compare() and the XPath equivalent deep-equal().

For comparing two equal documents of ~200Kb xmldiff:compare() on my machine taks 60msecs, deep-equal() over 9 seconds!

Expected behavior The functions vary slightly in interface, but just comparing two documents should roughly take the same amount of time. And if not 9 seconds is a bit too much I think.

To Reproduce Unzip, import in eXist, execute test-equals.xql (I'm doing this from oXygen)

Context (please always complete the following information):

Additional context

xatapult commented 3 years ago

TEST.zip

dizzzz commented 3 years ago

AFAIK for xmldiff:compare() two files are serialised and compared with an external tool; I suspect that deep-equal() uses a different mechanism and might compare more?

line-o commented 2 years ago

I have recently played around with fn:deep-equal#2 and am wondering what it actually should do. It does not do well comparing two node-sets from my testing. I ended up serialising both elements and calling deep-equal on that.

PieterLamers commented 2 years ago

@dizzzz @adamretter I just heard that there are plans to ditch xmldiff:compare. I'd really hope for this issue to be looked into before this is effectuated.

adamretter commented 2 years ago

@PieterLamers I don't think we would intentionally remove it if there is not a good replacement. However, please remind me as we prepare eXist 7, so we don't remove it by accident as part of our tech debt cleanup

adamretter commented 2 years ago

@PieterLamers So we definitely are not removing xmldiff:compare - as we just added a new implementation of it here - https://github.com/eXist-db/exist/pull/4554

adamretter commented 2 years ago

@xatapult I reran your tests on my laptop:

  1. With the new implementation (#4554) of xmldiff:compare: 228ms
  2. With the new implementation (#4529) of fn:deep-equals: 12871ms

It looks like we would still need to optimise fn:deep-equals...