chrisjbillington / hg-export-tool

A tool to convert mercurial repositories to git ones locally, working around some deficiencies in github's importer and in `hg-fast-export`
GNU General Public License v2.0

Program complains of uncommitted changes when migrating a repository #9

Open juan88 opened 4 years ago

juan88 commented 4 years ago

Hello, I was able to export a bunch of repos with this tool after resolving issue #8, but I'm getting another error when trying to migrate a bigger repo. This time I have to use a branch map file because I don't want Mercurial's default branch to become master, since we're not using that branch.

I'm running:

python3.5 exporter.py repomap -B branch.map

And the output is:

Initialized empty Git repository in /tmp/repo-e8c7fddeb3204428e55979c4df3fef64/.git/
abort: uncommitted changes
(commit or update --clean to discard changes)
Traceback (most recent call last):
  File "exporter.py", line 255, in <module>
    main()
  File "exporter.py", line 251, in main
    BASH
  File "exporter.py", line 175, in process_repo
    amended_commits = fix_branches(hg_repo_copy)
  File "exporter.py", line 120, in fix_branches
    subprocess.check_call(['hg', 'up', head['hash']], cwd=hg_repo)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['hg', 'up', 'ab97f3427924cf7bd11b53e4a09332bb261fdefe']' returned non-zero exit status 255

After seeing this error message, I double-checked the status of the repo by running hg up --clean but the repository was clean: 0 files updated, 0 files merged, 0 files removed, 0 files unresolved

I've also tried migrating with a fresh checkout of the repository and without passing the branch mapping file, and the error is the same.

I'm running Python 3.5.2 and Mercurial 5.4.1 on Linux Mint 18.1.

Thanks!

juan88 commented 4 years ago

I could really use some help! Bitbucket will deprecate my hg repo at the end of the day :(

chrisjbillington commented 4 years ago

Sorry, I managed to miss this.

Per the error message, it sounds like you might have uncommitted changes. Have you verified that you don't?

This tool is not designed to migrate uncommitted changes. If you have them, make a commit just for the sake of doing the conversion - you can always do a git reset to turn them back into uncommitted changes once you're in git-land.
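
For anyone wanting to script that check, here is a minimal sketch (not part of hg-export-tool) that lists what `hg status` reports for a repository. The function names `parse_hg_status` and `uncommitted_changes` are hypothetical, and skipping untracked `?` entries is an assumption about which entries matter for `hg up`. It avoids f-strings so it also runs on Python 3.5.

```python
import subprocess


def parse_hg_status(output):
    """Split default `hg status` output into (status_code, path) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line looks like "M path/to/file": a code, a space, then the path.
        code, _, path = line.partition(' ')
        entries.append((code, path))
    return entries


def uncommitted_changes(hg_repo):
    """Return (code, path) pairs for changes to tracked files in hg_repo.

    Assumption: untracked files ('?') are ignored here, since they are
    not what "uncommitted changes" normally refers to.
    """
    output = subprocess.check_output(['hg', 'status'], cwd=hg_repo).decode()
    return [(code, path) for code, path in parse_hg_status(output) if code != '?']
```

Calling `uncommitted_changes('/path/to/repo')` on a clean working copy should return an empty list; anything else is worth looking at before converting.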

I'm about to get on an international flight so I will not be able to help today, I'm sorry. But as long as you have a local backup of the mercurial repositories you won't lose repo data.

I see there is a flurry of activity in the bitbucket mercurial deprecation thread today...wonder if they might hold off deleting them a bit longer, many people about to get screwed.

juan88 commented 4 years ago

Sure no problem!

I have triple-checked that there aren't any uncommitted changes. I've also tried checking out a fresh copy of the repo, but had no luck either time.

chrisjbillington commented 4 years ago

Perhaps the code that amends anonymous heads/bookmarks to change their branch name is somehow leaving uncommitted data. Maybe you could add the following print line:

diff --git a/exporter.py b/exporter.py
index 50af2b3..c39325b 100644
--- a/exporter.py
+++ b/exporter.py
@@ -112,6 +112,7 @@ def fix_branches(hg_repo):
             heads_to_rename = [head for head in heads if head['topological']]
         counter = itertools.count(1)
         for head in heads_to_rename:
+            print(head)
             if head['bookmark'] is not None:
                 new_branch_name = head['bookmark']
             else:

and show the output? I wonder if it is failing on the first head it tries to amend, or on a subsequent one.

One could add --clean to the hg up command to ignore these uncommitted changes, but I would be concerned something is wrong such that the result might not be right.

The branch map shouldn't be relevant, since that is passed to hg-fast-export, which the conversion has not gotten as far as actually running yet - the failure looks to be in my code for making anonymous heads/bookmarks have unique branch names.

juan88 commented 4 years ago

This is the output of the command.

Initialized empty Git repository in /tmp/flydreamers-8a544011c8296d6255b468b538bb70ab/.git/
{'bookmark': None, 'timestamp': 1569014524, 'topological': True, 'hash': 'f607d22545566e88775fed4abb00a37dfdc3d643', 'branch': 'travel-phase-1-1'}
abort: uncommitted changes
(commit or update --clean to discard changes)
Traceback (most recent call last):
  File "exporter.py", line 256, in <module>
    main()
  File "exporter.py", line 252, in main
    BASH
  File "exporter.py", line 176, in process_repo
    amended_commits = fix_branches(hg_repo_copy)
  File "exporter.py", line 121, in fix_branches
    subprocess.check_call(['hg', 'up', head['hash']], cwd=hg_repo)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['hg', 'up', 'f607d22545566e88775fed4abb00a37dfdc3d643']' returned non-zero exit status 255

The branch mentioned there, travel-phase-1-1, is an old branch that was closed a few months ago. I updated to it and there are no uncommitted changes.

chrisjbillington commented 4 years ago

Since this is the first time hg-export-tool tries to call hg up, it makes me think any uncommitted changes are not created by hg-export-tool's branch fixing code.

hg-export-tool first creates a copy of the repository in a temporary directory before messing with it. The copy is made with shutil.copytree(). I am wondering if perhaps this copy function is, for example, not preserving file attributes, such that the copy of the repository appears to have uncommitted changes.
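
To test that hypothesis, one could compare the original repository and the temporary copy file by file. The sketch below is not part of hg-export-tool; `metadata_differences` is a hypothetical helper that reports files whose type, permission bits or size differ between two trees. Note that shutil.copytree() follows symlinks by default (symlinks=False), so a symlink to a file in the original becomes a regular file in the copy, which this check would report as a 'type' difference.

```python
import os
import stat


def metadata_differences(src_root, dst_root):
    """Return (relative_path, reason) pairs for files that differ.

    reason is one of 'missing', 'type', 'mode' or 'size'. Only entries
    that os.walk lists as files are checked; file contents are not read.
    """
    differences = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.lexists(dst):
                differences.append((rel, 'missing'))
                continue
            # lstat so that symlinks are compared as symlinks, not targets.
            s, d = os.lstat(src), os.lstat(dst)
            if stat.S_IFMT(s.st_mode) != stat.S_IFMT(d.st_mode):
                differences.append((rel, 'type'))
            elif stat.S_IMODE(s.st_mode) != stat.S_IMODE(d.st_mode):
                differences.append((rel, 'mode'))
            elif s.st_size != d.st_size:
                differences.append((rel, 'size'))
    return differences
```

An empty result would suggest the copy is faithful at this level and that the cause lies elsewhere.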

If so, adding --clean to the hg up command that crashes would be harmless.

But the cleaner fix would be to make the copy of the hg repo using hg clone instead of a filesystem copy:

diff --git a/exporter.py b/exporter.py
index 50af2b3..19ea96b 100644
--- a/exporter.py
+++ b/exporter.py
@@ -45,7 +45,7 @@ def copy_hg_repo(hg_repo):
     hg_repo_copy = os.path.join(
         gettempdir(), os.path.basename(hg_repo) + '-' + random_hex
     )
-    shutil.copytree(hg_repo, hg_repo_copy)
+    subprocess.check_call(['hg', 'clone', hg_repo, hg_repo_copy])
     return hg_repo_copy

 def get_heads(hg_repo):

(this is untested)

juan88 commented 4 years ago

This is the output of the program. It mentions the default branch but it is no longer used as the main branch.

Initialized empty Git repository in /tmp/flydreamers-d70f30bb0f0556a726399d04809b594c/.git/
no changes found
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Traceback (most recent call last):
  File "exporter.py", line 256, in <module>
    main()
  File "exporter.py", line 252, in main
    BASH
  File "exporter.py", line 176, in process_repo
    amended_commits = fix_branches(hg_repo_copy)
  File "exporter.py", line 93, in fix_branches
    all_heads = get_heads(hg_repo)
  File "exporter.py", line 71, in get_heads
    output = subprocess.check_output(cmd, cwd=hg_repo)
  File "/usr/lib/python3.5/subprocess.py", line 626, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['hg', 'heads', '--closed', '--template', 'json']' returned non-zero exit status 1

chrisjbillington commented 4 years ago

I will have to test later to work out why the hg clone does not seem to work.

However, I am fairly convinced the problem is with the copying of the repository, and if so, the --clean argument to hg up will also resolve the issue (it will just ignore whatever changes there supposedly are due to the incorrect copying, and update cleanly to the other branches)

So although I can't guarantee this will result in a correct conversion, if you are in a hurry I would recommend reverting the change to use hg clone, and just ignoring uncommitted changes when doing the update:

diff --git a/exporter.py b/exporter.py
index 50af2b3..7197e89 100644
--- a/exporter.py
+++ b/exporter.py
@@ -117,7 +117,7 @@ def fix_branches(hg_repo):
             else:
                 new_branch_name = branch + '-%d' % next(counter)
             # Amend the head to modify its branch name:
-            subprocess.check_call(['hg', 'up', head['hash']], cwd=hg_repo)
+            subprocess.check_call(['hg', 'up', '--clean', head['hash']], cwd=hg_repo)
             # Commit must be in draft phase to be able to amend it:
             subprocess.check_call(
                 ['hg', 'phase', '--draft', '--force', head['hash']], cwd=hg_repo

I will have more time later to look into why the hg clone method doesn't work.

juan88 commented 4 years ago

I think it is now able to clone the repo, but another error comes up, again with a branch already mentioned. I'm running this command against a fresh copy of the repo, not the one I was using in day-to-day development. I really appreciate your help in this matter. I'm copying the output of the command below.

Initialized empty Git repository in /tmp/flydreamers-ec085d0fb50be602c1fe93a936e772d9/.git/
updating to branch default
24602 files updated, 0 files merged, 0 files removed, 0 files unresolved
6626 files updated, 0 files merged, 14375 files removed, 0 files unresolved
marked working directory as branch TIT-791-1
saved backup bundle to /tmp/flydreamers-aced01cbff7f770365d65e9101f3f2ef/.hg/strip-backup/2c07678a84bc-89c27a7d-amend.hg
517 files updated, 0 files merged, 257 files removed, 0 files unresolved
abort: a branch of the same name already exists
(use 'hg update' to switch to it)
Traceback (most recent call last):
  File "exporter.py", line 257, in <module>
    main()
  File "exporter.py", line 253, in main
    BASH
  File "exporter.py", line 177, in process_repo
    amended_commits = fix_branches(hg_repo_copy)
  File "exporter.py", line 127, in fix_branches
    subprocess.check_call(['hg', 'branch', new_branch_name], cwd=hg_repo)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['hg', 'branch', 'travel-phase-1-1']' returned non-zero exit status 255

chrisjbillington commented 4 years ago

Ah, this looks like a unique issue where my code is appending integers to branch names in order to make them unique, but you happen to already have a branch name the same as the result of doing this.

You can modify the code to add something like '-renamed' to the new branch name to avoid the collision in your case:

--- a/exporter.py
+++ b/exporter.py
@@ -115,9 +115,9 @@ def fix_branches(hg_repo):
             if head['bookmark'] is not None:
                 new_branch_name = head['bookmark']
             else:
-                new_branch_name = branch + '-%d' % next(counter)
+                new_branch_name = branch + '-renamed-%d' % next(counter)
             # Amend the head to modify its branch name:

Ideally the tool would provide command line options for this sort of thing, but the above should work in your case!

(if you don't like the resulting branch names, it is trivial to rename git branches once you have a git repository since they are more like hg bookmarks and not embedded in commit data)
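
A more general way to avoid this kind of collision would be to check each candidate name against the branch names that already exist and keep counting until one is free. This is only a sketch, not the tool's actual code; `unique_branch_name` and its `existing_names` parameter are hypothetical.

```python
import itertools


def unique_branch_name(branch, existing_names):
    """Return the first '<branch>-<n>' (n = 1, 2, ...) not in existing_names."""
    for n in itertools.count(1):
        candidate = '%s-%d' % (branch, n)
        if candidate not in existing_names:
            return candidate


# With the branch names from this thread, the taken name is skipped:
existing = {'travel-phase-1', 'travel-phase-1-1'}
print(unique_branch_name('travel-phase-1', existing))  # travel-phase-1-2
```

The caller would need to add each returned name to `existing_names` before asking for the next one, otherwise two heads of the same branch would get the same new name.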

juan88 commented 4 years ago

I'll try that as well. I tried running fast-export alone, and also with a dockerised image in order to avoid problems with dependencies. It seems that there is a branch with an unnamed head or something like that. From what I see in the fast-export issues, there are known problems with that. It seems that there were two branches that were closed independently some months ago and that are causing these issues.

I tried removing one of the closed heads and merging the other, even though I don't need them anymore, to see if that also improves things. I wanted to let you know that as well. I'll try the fix you propose to see if it helps.

chrisjbillington commented 4 years ago

If you don't use bookmarks and you merge heads together such that there are no unmerged heads with the same branch names as each other, then there is no real reason to use hg-export-tool instead of hg-fast-export by itself. The only purpose of hg-export-tool is to handle multiple heads with the same branch name, and bookmarks.
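
To see whether a repository is in that situation, one can group heads by branch name. The sketch below assumes a list of head dicts shaped like the one printed earlier in this thread (with 'branch' and 'hash' keys); `branches_with_multiple_heads` is a hypothetical helper, and the hashes in the usage example are placeholders, not real commits.

```python
from collections import defaultdict


def branches_with_multiple_heads(heads):
    """Map each branch name that has more than one head to its head hashes."""
    by_branch = defaultdict(list)
    for head in heads:
        by_branch[head['branch']].append(head['hash'])
    return {branch: hashes for branch, hashes in by_branch.items() if len(hashes) > 1}


# Placeholder data: two heads share the branch name 'feature'.
heads = [
    {'branch': 'default', 'hash': 'aaaa'},
    {'branch': 'feature', 'hash': 'bbbb'},
    {'branch': 'feature', 'hash': 'cccc'},
]
print(branches_with_multiple_heads(heads))  # {'feature': ['bbbb', 'cccc']}
```

If this returns an empty dict and no bookmarks are in use, hg-fast-export alone should be enough.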

juan88 commented 4 years ago

I could not run fast-export when I first tried it. I was not using bookmarks or something like that in that repo that I know of. That's why I found this tool.

I finally managed to run hg-fast-export on this particular problematic repository (the one that originated this bug), but under a Docker image. It seems that the bug I raised in the other repo, #233, happens in my local environment but not in that Docker image. What is more, the problematic repo had something odd that happened twice due to a bad merge some time ago: two colleagues of mine independently closed the same branch. If you look at the node graph, there appear to be two heads (excuse me if they are not technically called heads) for the same branch, since both of them closed it. You would expect one of them to merge the working state and then close the branch. Well, that's not what happened.

In this scenario, neither tool was able to handle that. So after I figured out that the situation could be problematic, I tried merging the two heads together and then closing that branch. This seems to have let fast-export continue running and migrate the repo.

I'm not sure if that is the root cause of the problem but at least I could migrate that repo.
