svn-all-fast-export / svn2git

:octopus: A fast-import based converter for an svn repo to git repos
GNU General Public License v3.0

Cherry-pick merges in svn are shown as merges in git #147

Open PeterGrandcourt opened 2 years ago

PeterGrandcourt commented 2 years ago

update: added missing merge step in the SVN operations.

Hello, I have an issue where svn merges of specific revisions into branches are interpreted by the conversion as full merges. The git equivalent of an svn merge of specific revisions is a cherry-pick.

Is there a way of overriding this behaviour?

Take the following example

SVN operations

Set up the repo

Revision: 1
Message:
adding standard structure
----
Added : /branches
Added : /tags
Added : /trunk

add file_1.txt to trunk

Revision: 2
Message:
adding file 1
----
Added : /trunk/file_1.txt

add file_2.txt to trunk

Revision: 3
Message:
adding file 2
----
Added : /trunk/file_2.txt

Branch v_1 from trunk on revision 2

Revision: 5
Message:
creating version 1 from revision 2
contains file 1
----
Added : /branches/v_1 (Copy from path: /trunk, Revision, 2)

Merge revision 3 (the one that added file_2.txt) into v_1

r6 
Changed paths:
   M /branches/v_1
   A /branches/v_1/file_2.txt (from /trunk/file_2.txt:3)

merging file_2.txt from trunk
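For anyone who wants to reproduce this, the SVN steps above can be sketched as a small shell function. This is only a sketch under assumptions: the function name `make_svn_repro` and the scratch directory layout are made up for illustration, it uses a local `file://` repository, and the revision numbers will not match the listing exactly (the branch is created as r4 here rather than r5, and the merge is r5 rather than r6).

```shell
#!/bin/sh
# Hypothetical helper (not part of svn or svn-all-fast-export): rebuilds a
# small repository with the same shape as the one described in this issue.
make_svn_repro() {
    root=$1                                  # scratch directory to create
    mkdir -p "$root" || return 1
    root=$(cd "$root" && pwd)                # absolute path for the file:// URL
    svnadmin create "$root/repo" || return 1
    url="file://$root/repo"

    # r1: standard structure, created in a single commit
    svn mkdir -q -m "adding standard structure" \
        "$url/trunk" "$url/branches" "$url/tags" || return 1

    # r2 and r3: add file_1.txt and file_2.txt to trunk, one commit each
    svn checkout -q "$url/trunk" "$root/trunk_wc" || return 1
    for n in 1 2; do
        echo "file $n" > "$root/trunk_wc/file_$n.txt"
        svn add -q "$root/trunk_wc/file_$n.txt" || return 1
        svn commit -q -m "adding file $n" "$root/trunk_wc" || return 1
    done

    # r4 in this sketch: branch v_1 from trunk as it was at revision 2
    svn copy -q -m "creating version 1 from revision 2" \
        "$url/trunk@2" "$url/branches/v_1" || return 1

    # r5 in this sketch: merge only revision 3 of trunk into v_1
    svn checkout -q "$url/branches/v_1" "$root/v1_wc" || return 1
    (
        cd "$root/v1_wc" &&
        svn merge -q -c 3 "$url/trunk" . &&
        svn commit -q -m "merging file_2.txt from trunk"
    ) || return 1
}
```

Calling `make_svn_repro /tmp/svn_repro` (any empty scratch path will do) leaves the repository in `/tmp/svn_repro/repo`, and `svn log -v` on its URL should show a history comparable to the one listed above.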

Convert svn to git

rules

create repository test
end repository

match /trunk/
  repository test
  branch trunk
end match

match /branches/([^/]+)/
  repository test
  branch \1
end match

match /
end match

convert

docker run --rm -it -v "$(pwd)"/workdir:/workdir -v /u02/svnroot/test:/tmp/svn -v "$(pwd)"/conf:/tmp/conf svn2git /usr/local/svn2git/svn-all-fast-export --identity-map /tmp/conf/authors-transform.txt --rules /tmp/conf/rules --add-metadata --svn-branches --debug-rules --svn-ignore --empty-dirs /tmp/svn/

check the results

clone and check out v_1

git clone --branch v_1 ./workdir/test/ ./workdir/test_gitwc
cd workdir/test_gitwc

look at the history of the v_1 branch

commit 34b16b76dd4984332662500f118220c58a8447c8 (HEAD -> v_1, origin/v_1)
Merge: 9dfb79f 21bdd65
Author: peter.grandcourt <peter.grandcourt@iongroup.com>
Date:   Fri Sep 2 08:24:08 2022 +0000

    merging file_2.txt from trunk

    svn path=/branches/v_1/; revision=6

commit 9dfb79fec5ee9f5dd02d9de78fdd8457c175c273
Author: peter.grandcourt <peter.grandcourt@iongroup.com>
Date:   Fri Sep 2 08:17:58 2022 +0000

    creating version 1 from revision 2
    contains file 1

    svn path=/branches/v_1/; revision=5

commit 21bdd65a736d2f8f55eb6d7b1bd4d9d8c6b7cb62
Author: peter.grandcourt <peter.grandcourt@iongroup.com>
Date:   Fri Sep 2 08:13:44 2022 +0000

    adding file 2

    svn path=/trunk/; revision=3

commit cd3ecc5a43ce0127471236dbdad2af932a57b66e
Author: peter.grandcourt <peter.grandcourt@iongroup.com>
Date:   Fri Sep 2 08:10:22 2022 +0000

    adding file 1

    svn path=/trunk/; revision=2

expected

svn revision 3 (adding file_2 to the trunk) should not be visible in the history of branch v_1 in git. It is a revision from the trunk, and in svn we did not do a full merge of trunk into v_1; we only merged the change from that one revision.

actual

svn revision 3 (adding file_2 to the trunk) is visible in the history of branch v_1 in git. Does svn-all-fast-export interpret all merges as full merges instead of performing a cherry-pick?
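To make the difference between expected and actual concrete, here is a minimal sketch using plain git in a throwaway repository. It does not run svn-all-fast-export; the branch names `as_merge` and `as_cherry_pick` and the demo identity are made up for illustration. `as_merge` stands in for what the conversion produced, `as_cherry_pick` for what I expected.

```shell
#!/bin/sh
# Throwaway demo: the same change recorded as a merge versus as a cherry-pick.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk        # name the first branch "trunk"
git config user.name "Demo User"              # demo-only identity (assumption)
git config user.email "demo@example.invalid"

echo one > file_1.txt; git add file_1.txt; git commit -q -m "adding file 1"
git branch as_merge                           # both branches start at "file 1"
git branch as_cherry_pick
echo two > file_2.txt; git add file_2.txt; git commit -q -m "adding file 2"

# What the conversion produced: a merge commit with trunk as second parent.
git checkout -q as_merge
git commit -q --allow-empty -m "creating version 1"
git merge -q --no-ff -m "merging file_2.txt from trunk" trunk

# What was expected: only the change is copied, with no link back to trunk.
git checkout -q as_cherry_pick
git commit -q --allow-empty -m "creating version 1"
git cherry-pick trunk >/dev/null

# Count the parents of a branch tip (2 or more means a merge commit).
parents_of() { git rev-list --parents -n 1 "$1" | awk '{ print NF - 1 }'; }
merge_parents=$(parents_of as_merge)
pick_parents=$(parents_of as_cherry_pick)

# Is the trunk commit reachable from each branch, i.e. shown by git log?
if git merge-base --is-ancestor trunk as_merge; then
    merge_sees_trunk=yes; else merge_sees_trunk=no; fi
if git merge-base --is-ancestor trunk as_cherry_pick; then
    pick_sees_trunk=yes; else pick_sees_trunk=no; fi

echo "as_merge:       parents=$merge_parents trunk commit in history: $merge_sees_trunk"
echo "as_cherry_pick: parents=$pick_parents trunk commit in history: $pick_sees_trunk"
```

Both branches end up with the same files, but only `as_merge` pulls the trunk commit into its history. The same two checks can be pointed at the converted repository, for example `git rev-list --parents -n 1 v_1` and `git merge-base --is-ancestor 21bdd65 v_1`.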

PeterGrandcourt commented 2 years ago

The consequence of this problem is that the contents of a branch do not match the history of the branch. I think this is a critical issue: if you care about branch history and cannot rely on it, then the converted repository is unusable. If, on the other hand, you don't care about history, then you don't need to convert a repository with all its history.

For most people migrating from svn with a standard (or relatively standard) repository, git svn will be a better choice, as it records the branch history correctly.

git svn, however, is problematic if the repository is very big (it is quite slow). If you have moved branches around (including moving trunk around), or just have a lot of branches, then following parents (the default behaviour) causes it to recurse through the history until it finds the parent. If you have many branches and a big repo, this takes a lot of time. Switching off follow-parent means that branch history is not connected to branch-parent history.

svn-all-fast-export is a great tool, but this problem makes it unusable at the moment for some use cases and I hope this issue can be resolved.