frej / fast-export

A mercurial to git converter using git-fast-import
http://repo.or.cz/w/fast-export.git

Simplifications of get_filechanges #299

Closed felipec closed 1 year ago

felipec commented 1 year ago

This code hasn't been touched in decades, and most of it is not necessary. In git-remote-hg we've been using a simplified version of get_filechanges+split_dict for many years (maybe a decade?).

All that is needed is to compare the files of the new commit with the ones of the first parent.
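A minimal sketch of that comparison, not the code from this PR. It assumes each commit's files are available as a plain dict mapping file name to content hash, which stands in for a Mercurial manifest; the function name diff_against_first_parent is hypothetical.

```python
def diff_against_first_parent(parent_files, commit_files):
    """Return (changed, removed) file names relative to the first parent.

    Both arguments are plain dicts mapping file name -> content hash.
    They stand in for Mercurial manifests (an assumption for illustration).
    """
    # Files that are new, or whose hash differs from the first parent's.
    changed = sorted(
        name for name, node in commit_files.items()
        if parent_files.get(name) != node
    )
    # Files the parent had that the new commit no longer has.
    removed = sorted(name for name in parent_files if name not in commit_files)
    return changed, removed


parent = {"README": "a1", "setup.py": "b2", "old.txt": "c3"}
commit = {"README": "a1", "setup.py": "b9", "new.txt": "d4"}
changed, removed = diff_against_first_parent(parent, commit)
print(changed, removed)
```

For a root commit, passing an empty dict as the parent reports every file as changed and nothing as removed.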

To test the different versions of the code I wrote a benchmarking tool, filechanges-perf. The new version of the code is reliably faster and produces the same resulting repos. For the hg repository it's almost twice as fast.

I tested with hg, mozilla-central, and hg-git:

hg '0:10000':
3cd697c5eccb477a29f9570d2b829d8ee807b0cc
0: 46.59
3cd697c5eccb477a29f9570d2b829d8ee807b0cc
1: 25.71

mozilla-central '0:10000':
c027d5531d03c29b879ca69f58d3e8997e8ae7e8
0: 113.60
c027d5531d03c29b879ca69f58d3e8997e8ae7e8
1: 102.37

hg-git '0:tip':
a4c5b8d1e35739501a7141b844fc4eaea57b6c5a
0: 3.25
a4c5b8d1e35739501a7141b844fc4eaea57b6c5a
1: 2.63

Same output, faster execution, and simpler code.

frej commented 1 year ago

I like this and also #298. I will merge #298 first, then convert my (not published) smoke test to Sharness, merge that, and then merge this PR. If this PR is NFC (no functional change), the smoke test should produce the same result.

frej commented 1 year ago

@felipec: I have now merged both #298 and #299, thank you for your contribution!

felipec commented 1 year ago

Great! I presume you tried to clone different repositories and the result was exactly the same.

frej commented 1 year ago

Yes, I tested with the gmp, hg, lemon, nginx, pidgin, and xine-lib repos. git fast-export --all | sha1sum produces the same hash for both versions.
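The same check can be scripted. This is a rough sketch, assuming git is on the PATH; the helper names stream_sha1 and fast_export_sha1 are hypothetical and not part of fast-export.

```python
import hashlib
import subprocess


def stream_sha1(chunks):
    """Return the hex SHA-1 of an iterable of byte chunks."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def fast_export_sha1(repo_path):
    """Hash the output of `git fast-export --all` run inside repo_path."""
    proc = subprocess.Popen(
        ["git", "fast-export", "--all"],
        cwd=repo_path,
        stdout=subprocess.PIPE,
    )
    # Read in 1 MiB chunks so large repositories are not held in memory.
    result = stream_sha1(iter(lambda: proc.stdout.read(1 << 20), b""))
    if proc.wait() != 0:
        raise RuntimeError("git fast-export failed in %s" % repo_path)
    return result
```

Two conversions of the same Mercurial repository, one per version of the code, match when fast_export_sha1 returns the same value for both resulting git repositories.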