sourcegraph / sourcegraph-public-snapshot


Write external import script for CVS repositories #23757

Closed: LawnGnome closed this issue 3 years ago

LawnGnome commented 3 years ago

Based on the excellent work of @christinelovett and @abeatrix, we need to write a script that allows CVS repositories to be imported into Sourcegraph.

Requirements

Prior art

There are three tools/workflows I know of that allow for CVS-to-Git conversion:

Preferred method

I believe there's a path to implementing a tool with the desired properties. The key is that we can use git-fast-import to perform a full import with only one pass over the CVSROOT:

  1. Add all file revisions as blobs, not bothering to note if we actually need them or not.
  2. Simultaneously build an ordered map of file commits, (author, commit) => (file, time, revision), with the order taken from the RCS revision IDs.
  3. Split map values into buckets based on the closeness of their commit times, as cvsps does.
  4. Retrieve patchsets in order, each with its author, commit, and files. Tracking the previous patchset as the parent commit, we can use filedeleteall when constructing git-fast-import commits and let Git figure out the history.
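
To make steps 2 to 4 concrete, here is a rough Python sketch of the bucketing and commit rendering. It is only an illustration under stated assumptions: the `FileRevision` record, the 300 second window, the `refs/heads/main` ref and the placeholder email domain are all hypothetical, and it orders revisions by commit time rather than by RCS revision ID to keep things short. Deleted ("dead") revisions and path quoting are not handled.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record for one RCS file revision; the field names are
# illustrative and not taken from any existing code.
@dataclass(frozen=True)
class FileRevision:
    author: str
    message: str
    path: str
    time: int      # commit time, seconds since the epoch
    revision: str  # RCS revision ID, e.g. "1.4"


WINDOW_SECONDS = 300  # assumed fuzz window, not a value from cvsps


def bucket_patchsets(revisions, window=WINDOW_SECONDS):
    """Group revisions sharing (author, message) whose times are close."""
    by_key = defaultdict(list)
    for rev in revisions:
        by_key[(rev.author, rev.message)].append(rev)

    patchsets = []
    for revs in by_key.values():
        revs.sort(key=lambda r: r.time)
        current = [revs[0]]
        for rev in revs[1:]:
            # Start a new bucket when the gap to the previous revision
            # exceeds the window, or the file is already in this bucket.
            same_file = any(r.path == rev.path for r in current)
            if rev.time - current[-1].time > window or same_file:
                patchsets.append(current)
                current = [rev]
            else:
                current.append(rev)
        patchsets.append(current)

    # Order patchsets by the time of their earliest revision.
    patchsets.sort(key=lambda ps: ps[0].time)
    return patchsets


def fast_import_commit(patchset, mark, parent_mark, tree, blob_marks,
                       ref="refs/heads/main"):
    """Render one patchset as a git-fast-import commit command.

    `tree` maps path -> blob mark for the whole tree so far and is updated
    in place; `blob_marks` maps (path, revision) -> mark of an imported blob.
    """
    first = patchset[0]
    for rev in patchset:
        tree[rev.path] = blob_marks[(rev.path, rev.revision)]

    message = first.message.encode("utf-8")
    lines = [
        f"commit {ref}",
        f"mark :{mark}",
        # Placeholder email: CVS only records a username.
        f"committer {first.author} <{first.author}@example.invalid> "
        f"{first.time} +0000",
        f"data {len(message)}",  # byte count of the commit message
        first.message,
    ]
    if parent_mark is not None:
        lines.append(f"from :{parent_mark}")
    # Clear the tree, then restate every file known so far.
    lines.append("deleteall")
    for path in sorted(tree):
        lines.append(f"M 100644 :{tree[path]} {path}")
    return "\n".join(lines) + "\n"
```

Because every commit starts with `deleteall` and then restates the full tree, the importer never has to compute per-file diffs against the parent itself.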

If we store and retrieve marks and inferred patchsets, we can also do this incrementally.
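
As a sketch of how the incremental case might look: `git fast-import` can persist its mark table with `--export-marks` and reload it with `--import-marks-if-exists`, so a later run can refer to blobs and commits from an earlier one. The state file layout and the helper names below are assumptions made for illustration, not an existing format.

```python
import json
from pathlib import Path


def fast_import_argv(marks_file):
    """Build a git fast-import command line that reuses marks across runs."""
    return [
        "git", "fast-import",
        # Skipped silently on the first run, when the file does not exist.
        f"--import-marks-if-exists={marks_file}",
        f"--export-marks={marks_file}",
    ]


def save_state(path, last_commit_mark, tree):
    """Persist the last emitted commit mark and the current path -> blob
    mark tree (hypothetical layout) so the next run can resume."""
    state = {"last_commit_mark": last_commit_mark, "tree": tree}
    Path(path).write_text(json.dumps(state, sort_keys=True))


def load_state(path):
    """Return (last_commit_mark, tree), or (None, {}) on a first run."""
    p = Path(path)
    if not p.exists():
        return None, {}
    state = json.loads(p.read_text())
    return state["last_commit_mark"], state["tree"]
```

On a later run, the loaded commit mark would become the `from` parent of the first new patchset, and the loaded tree would seed the full tree that follows each `deleteall`.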

Milestones

To be turned into issues:

LawnGnome commented 3 years ago

Closing in favour of the newly split issues, which should all be linked above.