NavicoOS / ac2git

Tool to convert an AccuRev repository to Git.

Overview

ac2git is a tool to convert an Accurev depot into a Git repo. All specified Accurev streams will be the target of the conversion, and an attempt is made to map the Accurev stream model to a Git branching model. There are fundamental differences between the two that can make the converted repo history look strange at times, but we've done our best to favour correctness over beauty.

Getting started

Note: It is recommended that you run the conversion on a Linux machine if your Accurev depot contains symbolic links. Additionally, the converted repo will have more accurate file permissions if the conversion is run on a Linux machine.

How to use

Converting a Depot's Streams

The result

What this script will spit out is a Git repository with independent orphaned branches representing your streams. That is, each stream is converted separately onto a branch that has no merge points with any other branch. This is by design, as it was a simpler model to begin with.

Each Git branch accurately depicts the stream from which it was created w.r.t. time. This means that at each point in time the Git branch represents the state of your stream. Not only are the transactions for this stream committed to Git, but so are any transactions that occurred in the parent stream and automatically flowed down to it. Combined with the previous paragraph, this implies that you will see a number of commits on different branches with the same time, author and commit message, most often because they represent the same promote transaction.
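To see which commits are repeated across branches in this way, something like the following could be used. It is only a sketch and not part of ac2git: the function names `read_commits` and `shared_promotes` are made up for illustration, and it assumes that a repeated promote really does carry an identical author timestamp, author name and subject line on every branch.

```python
from collections import defaultdict
import subprocess


def read_commits(branch):
    """Return (branch, author_time, author, subject) tuples for one branch."""
    out = subprocess.run(
        # %at = author timestamp, %an = author name, %s = subject, %x09 = tab
        ["git", "log", "--format=%at%x09%an%x09%s", branch],
        check=True, capture_output=True, text=True,
    ).stdout
    commits = []
    for line in out.splitlines():
        when, author, subject = line.split("\t", 2)
        commits.append((branch, int(when), author, subject))
    return commits


def shared_promotes(commits):
    """Group commits by (time, author, subject); keep groups seen on 2+ branches."""
    groups = defaultdict(set)
    for branch, when, author, subject in commits:
        groups[(when, author, subject)].add(branch)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Example use inside a converted repository (branch names are placeholders):
# commits = read_commits("Parent_Stream") + read_commits("Child_Stream")
# for key, branches in shared_promotes(commits).items():
#     print(key, "appears on", branches)
```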

Ideally, if you have promoted all of your changes to the parent stream, this should be identified as a merge commit and recorded as such. Though it would now be possible to extend this script to do so, it is not on my radar for now as it would be a reasonably large undertaking. However, there is hope: I've implemented an experimental feature, described below, that does just that, but it operates as a post-processing step. It is still a little buggy and requires iteration, but it proves the concept. Patches are welcome!

Files that break history

If you have a legacy repository, it is possible that you have some files that break history. One typical example is a version file that has different copies across different streams and is never promoted. This will affect the ability of the algorithm to infer merge points between branches. Hence, it would be great if we could ignore these files when determining merge points.

This is possible to do but it is not a part of the script's functionality. It is a function of git that you can specify your own diff driver (see gitattributes) for particular files. This answer to the StackOverflow question titled Want to exclude file from "git diff" suggests the same. It might also pay to take a look at this answer on a similarly titled question from StackOverflow (Excluding files from git-diff).

Ignoring the whole file should be easy, but ignoring only a small part of it will require you to write a script that does it. Whether you write it in Bash or Python, it will be custom to each situation, so I can't really cater for it in this script, which is why this note is here.
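As a rough illustration of what such a custom script might look like, here is a sketch of an external diff driver that reports no difference when only certain lines changed. Everything specific in it is an assumption: the `IGNORED` pattern (a `#define VERSION` line) and the function names are invented for the example, and you would need to adapt them to your own file. The seven-argument calling convention is the one Git documents for external diff commands.

```python
import difflib
import re
import sys

# Hypothetical pattern: lines that carry a per-stream version string.
IGNORED = re.compile(r"^\s*#define\s+VERSION\b")


def significant_lines(text, ignored=IGNORED):
    """Drop the lines that should not count as a difference."""
    return [line for line in text.splitlines() if not ignored.search(line)]


def filtered_diff(old_text, new_text, path="file"):
    """Return unified-diff lines for the two texts with ignored lines removed."""
    return list(difflib.unified_diff(
        significant_lines(old_text), significant_lines(new_text),
        "a/" + path, "b/" + path, lineterm="",
    ))


def main(argv):
    # Git calls an external diff command with seven arguments:
    # path old-file old-hex old-mode new-file new-hex new-mode
    path, old_file, new_file = argv[1], argv[2], argv[5]
    with open(old_file, errors="replace") as f:
        old_text = f.read()
    with open(new_file, errors="replace") as f:
        new_text = f.read()
    lines = filtered_diff(old_text, new_text, path)
    if lines:  # print nothing at all when only ignored lines differ
        print("\n".join(lines))
    return 0


if __name__ == "__main__" and len(sys.argv) == 8:
    sys.exit(main(sys.argv))
```

Such a script would be set as the `command` of a custom diff driver in place of a do-nothing command, and the file in question would be pointed at that driver through its `diff` attribute.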

Note: I recommend using the .git/info/attributes file and not making a .gitattributes file in the main repo since it may be deleted by the script or overwritten if it was ever promoted in Accurev.

Example for Linux (from this stackoverflow answer):

Add the following to your .git/config file:

[diff "nodiff"]
    command = /bin/true

Add something like the following to the .git/info/attributes file of the conversion repository:

folder/bad_file.c diff=nodiff

On Windows you might need to find where the command true lives but it should be included with Git.
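If you are unsure where `true` lives on your machine, a small helper along these lines may save some searching. It is a sketch only: the function names are hypothetical, it relies on `true` being on your PATH (for example when run from Git Bash), and it does not attempt to quote paths that contain spaces.

```python
import shutil


def find_true_command():
    """Return the path of a `true` executable found on PATH, or None."""
    return shutil.which("true")


def nodiff_config(command):
    """Build the [diff "nodiff"] stanza for .git/config."""
    # Backslashes are escape characters in Git config values, so use
    # forward slashes for Windows paths.
    return '[diff "nodiff"]\n    command = {}\n'.format(command.replace("\\", "/"))


def attributes_line(path, driver="nodiff"):
    """Build the line for .git/info/attributes that selects the driver."""
    return "{} diff={}".format(path, driver)


# Example use:
# command = find_true_command()
# if command is not None:
#     print(nodiff_config(command))
#     print(attributes_line("folder/bad_file.c"))
```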

Tips & tricks

The two stages

The script has 2 stages. The first stage downloads all of the information out of Accurev into Git, while the second stage processes it all and produces Git branches. Stage 1 uses Accurev commands while Stage 2 executes solely Git commands.

Stage 1 stores everything under refs/ac2git/depots/, which is hidden from the normal user of the Git repository and isn't cloned/fetched by default. Once we have this information we no longer need to access the Accurev server in order to produce Git branches, with or without merges.

Stage 2 stores its information about the produced branches and their state primarily under refs/ac2git/state/, and stores some useful lookup tables under refs/ac2git/cache/ (i.e. a lookup table from stream names to stream numbers). This stage doesn't execute any Accurev commands and uses only Git commands that operate on the refs/ac2git/depots/ refs produced in Stage 1.
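To make the split concrete, a ref name can be mapped back to the stage that owns it by its prefix. The helper below is a sketch with a made-up name (`classify_ref`); only the three prefixes come from the description above.

```python
# Prefixes taken from the description of the two stages.
STAGE_BY_PREFIX = {
    "refs/ac2git/depots/": "stage 1: Accurev data and metadata",
    "refs/ac2git/cache/": "stage 2: lookup tables",
    "refs/ac2git/state/": "stage 2: branch state",
}


def classify_ref(refname):
    """Return the stage description for an ac2git ref, or None for other refs."""
    for prefix, stage in STAGE_BY_PREFIX.items():
        if refname.startswith(prefix):
            return stage
    return None
```

The ref names to feed into it could come from `git for-each-ref --format='%(refname)' refs/ac2git/`.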

Quick re-conversion

If you're trying out different configuration options there shouldn't be a need for you to redownload everything from Accurev every time (for some options you might, but most should be ok).

If you execute git show-ref in your converted repository you will see a lot of references starting with refs/ac2git/. These fall into 3 categories:

refs/ac2git/depots/...    <-- These store Accurev data & metadata for each stream (stage 1)
refs/ac2git/cache/...     <-- These store lookup tables for converting stream names to numbers (stage 2 -start)
refs/ac2git/state/...     <-- These store past positions of all branches. (stage 2 -main)

If you delete all of the refs starting with refs/ac2git/state/ and re-run the script, you will save yourself a lot of time by not re-downloading Accurev data & metadata.

To simply delete a single ref you would run the git update-ref -d <ref> command. So, to delete all of your "stage 2" refs, run the following command:

for REF in $(git show-ref | grep refs/ac2git/state/ | cut -d' ' -f2); do git update-ref -d "$REF"; echo "deleted $REF"; done

After which you should delete all the branches and restart the script.
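The whole reset (state refs plus branches) could also be scripted. The following is a sketch, not something shipped with ac2git: the function names are hypothetical, and it assumes that every branch under refs/heads/ was produced by the conversion and is safe to delete. Note that Git refuses to delete the branch that is currently checked out, so you may need to detach HEAD or handle that branch separately.

```python
import subprocess


def list_refs(prefix):
    """Return the full ref names under a prefix, e.g. 'refs/heads/'."""
    out = subprocess.run(
        ["git", "for-each-ref", "--format=%(refname)", prefix],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def reset_commands(state_refs, branch_refs):
    """Build the git commands that drop stage 2 state and the converted branches."""
    commands = [["git", "update-ref", "-d", ref] for ref in state_refs]
    for ref in branch_refs:
        # 'refs/heads/foo' -> 'foo'
        commands.append(["git", "branch", "-D", ref[len("refs/heads/"):]])
    return commands


# Example use inside the converted repository:
# for cmd in reset_commands(list_refs("refs/ac2git/state/"), list_refs("refs/heads/")):
#     subprocess.run(cmd, check=True)
```

Building the command list separately from running it lets you print and inspect what would be deleted before anything is touched.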

Tested with

master branch

Known compatibility issues

Note: It may be possible to convert an Accurev 4.7 depot by creating a single workspace that starts at transaction 1 and updating the workspace to every transaction up to the highest, committing into Git if there are any differences. This approach would be easier to implement in Ryan's original script. See issue 11 on parsley72's accurev2git repo.

Version 0.2 and earlier were tested with


Credits

This tool was inspired by the work done by Ryan LaNeve in his https://github.com/rlaneve/accurev2git repository and the desire to improve it. Since this script is sufficiently different I have placed it in a separate repository here. I must also thank Tom Isaacson for his contribution to the discussions about the tool and how it could be improved. It was his work that prompted me to start on this implementation. You can find his fork of the original repo here https://github.com/parsley72/accurev2git.

The algorithm used here was collaboratively devised by Robert Smithson, whose stated goal is to rid the multiverse of Accurev since ridding just our verse is not good enough, and myself.

My work is in the implementation and the merging part of the algorithm all of which I humbly offer to anyone who doesn't want to remain stuck with Accurev.


Dear contributors

I am not a Python developer, which should be evident to anyone who's seen the code. A lot of it was written late at night and was meant to be just a brain dump, to be cleaned up at a later date, but it remained. Please don't be dissuaded from contributing and helping me improve it because it will get us all closer to ditching Accurev! I will do my best to add some notes about my method and how the code works in the sections that follow so please read them.

I strongly recommend reading how_it_works.md for a written explanation of what the algorithm is meant to do, and hacking_guide.md for more information on the file structure and interesting functions.

For now it works as I need it to and that's enough.


Other works

There is one more conversion script called accurev-to-git, which was developed independently of this one by Serban Constantin. He wrote this blog post about it, which explains his methodology.



Copyright (c) 2015 Lazar Sumar

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.