CaveSurveying / CUCCexposurveyissues

Experimental issue tracker for distributing and survey work

Migrate the loser and tunneldata repositories to Git from Mercurial #16

Open goatchurchprime opened 6 years ago

goatchurchprime commented 6 years ago

I think Mercurial is so unpopular now that we probably should convert over. I keep typing the wrong commands, because everything else I do is in Git.

I think it can be done preserving all the changes and using all the same ssh protocols. https://stackoverflow.com/questions/16037787/convert-mercurial-project-to-git
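A rough sketch of what the basic conversion might look like, assuming the hg-fast-export tool (the approach described in answers to the Stack Overflow question above) is checked out locally. The function names and all paths are made up for illustration, and nothing is executed unless `dry_run` is turned off.

```python
import subprocess
from pathlib import Path


def conversion_commands(hg_repo, git_repo, fast_export_dir):
    """Return (argv, cwd) pairs for a basic hg -> git conversion.

    Assumes fast_export_dir is a checkout of the hg-fast-export tool.
    All three paths are hypothetical placeholders; use absolute paths,
    because the last two steps run inside git_repo.
    """
    script = str(Path(fast_export_dir) / "hg-fast-export.sh")
    return [
        # Create an empty git repository to import into.
        (["git", "init", str(git_repo)], None),
        # Replay every Mercurial changeset as a git commit.
        ([script, "-r", str(hg_repo)], str(git_repo)),
        # Populate the working tree from the imported history.
        (["git", "checkout", "HEAD"], str(git_repo)),
    ]


def convert(hg_repo, git_repo, fast_export_dir, dry_run=True):
    """Print the commands, and run them only if dry_run is False."""
    for argv, cwd in conversion_commands(hg_repo, git_repo, fast_export_dir):
        print(" ".join(argv), f"(in {cwd})" if cwd else "")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling `convert("/tmp/loser-hg", "/tmp/loser-git", "/tmp/fast-export")` only prints the three steps, which is a cheap way to review them before trying a real run on a copy of the repository.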

For students, knowing Git is going to be a more useful life skill at this point than ever hearing of Mercurial, so there will be some payback for this pain.

PhilipSargent commented 6 years ago

Possibly, just possibly, could we leave this discussion until after the imminent expo? Please?

goatchurchprime commented 6 years ago

If this argument gets carried, it affects how much effort is put into education on how to run Mercurial, vs doing the minimum to get us through using current systems for the remainder of the summer.

PhilipSargent commented 6 years ago

As I use Tortoise anyway, and would switch from TortoiseHg to TortoiseGit after such a change, and since I only attempt anything more complex than pull, push and update when things go wrong, I doubt that I (or most people in my position) would see much difference.

ojwb commented 6 years ago

I'd be very happy to not have to deal with mercurial, but please can we try to restore the missing pre-hg history of the loser repository when we build the new git repo's history. It's very frustrating to try to track down the history of something only to see that the interesting stuff is prior to the start of the history that's easily available. This has happened to me twice so far this week for the loser hg repo, so it's not just a theoretical problem.

Because each git commit's SHA hash is calculated over data which includes the parent commit's hash, we can't insert the older history later without changing the hash of every commit that comes after it in the repo, which is disruptive to say the least (this is a deliberate feature: it prevents an attacker tampering with the history). So doing this at the time of the conversion to git would be the best option.
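To make the point concrete, here is a small sketch of how a commit ID is derived. The object layout (a `commit <size>` header, a NUL byte, then the `tree`, `parent`, `author` and `committer` lines and the message) is git's format for SHA-1 repositories; the identity, timestamps, messages and the stand-in parent ID are made up for illustration.

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA-1 of a git object: '<type> <size>' + NUL header, then the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, message, who="A Caver <caver@example.org> 0 +0000"):
    """ID of a commit with the given tree, parent IDs and message.

    `who` is a placeholder identity and timestamp, used for both the
    author and committer lines.
    """
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message, ""]
    return git_object_id("commit", "\n".join(lines).encode())


# The empty tree, used here so the example needs no real files.
empty_tree = git_object_id("tree", b"")

# An hg-era history whose first commit has no parent...
first = commit_id(empty_tree, [], "first hg-era commit")
second = commit_id(empty_tree, [first], "second hg-era commit")

# ...and the same two commits with older history grafted underneath.
older = "0" * 40  # stand-in for the ID of the last pre-hg commit
first_grafted = commit_id(empty_tree, [older], "first hg-era commit")
second_grafted = commit_id(empty_tree, [first_grafted], "second hg-era commit")

print(first != first_grafted, second != second_grafted)
```

Both comparisons come out `True`: giving the first commit a parent changes its ID, and that change then propagates to every descendant even though their content and messages are identical.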

We'd need to track down a backup of the SVN repo. I probably have one somewhere, but it might not be the very latest. Restoring the history with a gap would still be an improvement, but if we can find one which covers up to at least svn r5127 (in 2003) that'd be good (it looks like hg and svn ran in parallel from then until svn r8493 in 2009, though not every svn revision number in between has a commit in hg so perhaps some of the hg commits from that time include changes from multiple svn revisions?)

I'm happy to do the extra work to get us a decent conversion.

BeckaLawson commented 6 years ago

I think I may have some SVN backups at home but I'm away until Monday (though if anyone else more IT-savvy has one that would be easier). Thanks for the offer to do it, Olly :-)

mshinwell commented 6 years ago

I'm happy to try to do this as well. I have some recent experience with another hg to git conversion which may be of use. I will also check to see if I have a backup of the old svn.

ojwb commented 6 years ago

So far I've found some backups of the CVS repo (so before SVN even) from the early 2000s.

> I think I may have some SVN backups at home

Great, though note that a backup of an SVN checkout is much less useful, as it only gives us a single version (unlike newer systems like hg and git, checking out the code from SVN doesn't give you the full history locally, only a single version at a time).

So ideally we're after a backup of the SVN repo (which you might well have - before we could push changes to the internet from expo easily we tended to burn a few copies onto CD-R at the end of expo and send them home in different cars - that's where the CVS repo backups I found so far are from).

But backups of SVN checkouts may still be useful if they fall into a gap we have in the history. The CVS repo backups I have mean we should at least have a complete history from when we started to use version control until the early 2000s, and then a complete history after hg became the master VCS (August 2009). So basically anything from 200x is potentially useful.
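One way to keep track of which backups are still worth hunting for is to list the period each one covers and compute what is left over. A minimal sketch follows; the function name is made up, and the dates in the example are placeholders rather than the real extents of any backup, apart from hg taking over in August 2009.

```python
from datetime import date


def uncovered(ranges, start, end):
    """Return the sub-ranges of [start, end] not covered by any range.

    `ranges` is an iterable of (first_day, last_day) pairs, in any order
    and possibly overlapping.
    """
    gaps = []
    cursor = start
    for lo, hi in sorted(ranges):
        if hi < cursor:
            continue  # entirely before the point we've covered up to
        if lo > end:
            break  # entirely after the window we care about
        if lo > cursor:
            gaps.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# Placeholder extents, for illustration only.
known = [
    (date(1999, 1, 1), date(2003, 6, 30)),  # hypothetical CVS backup
    (date(2009, 8, 1), date(2018, 6, 28)),  # hg as master VCS
]
for lo, hi in uncovered(known, date(1999, 1, 1), date(2018, 6, 28)):
    print(f"no history between {lo} and {hi}")
```

Each SVN repo backup or checkout that turns up would be added to `known`, and whatever the function still reports is the gap left to fill.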

> I have some recent experience with another hg to git conversion which may be of use

That may well be useful - I've converted CVS to SVN and SVN to git, but not hg to git before.

If people find useful backups, please send me a copy and I can try to assemble as complete a coverage of history as I can, then we can look at actually doing the conversion between this expo and next.

PhilipSargent commented 6 years ago

I have been consolidating the Tortoise/Hg sections of the Expo Handbook into one place (http://expo.survex.com/handbook/update.htm) and removing links to Tortoise/Hg everywhere except that page.

This will make it easier to change the handbook to reflect reality as this conversion proceeds.

Generally I have also changed explicit mentions of “mercurial” to “version control system” except in the “experts manual” bit.

To do:

Extract the simpler things like uploading photos and typing in logbooks from that page to dedicated “How do I…” pages.

btw I will be driving slowly from Basel to Staudnwirt from tomorrow during the coming week (campervan) so I won’t be able to do as much of this as I would like.

Philip

goatchurchprime commented 6 years ago

This could be the answer for handling the scans: https://git-lfs.github.com/

wobrotson commented 6 years ago

Not sure about that, you may have to explain all this to me at Hidden Earth...

wookey commented 5 years ago

troggle has been migrated to git, and the old erebus and cvs branches (pre 2010) removed. Some decrufting was done to get rid of log files, old copies of embedded javascript (codemirror, jquery etc) and some fat images no longer used.

tunneldata has also been migrated to git, and renamed 'drawings' as it includes therion data too these days.

The loser repo and expoweb repo need more care in migration. Loser should have the old 1999-2004 CVS history restored, and maybe Tom's annual snapshots from before that, so that ancient history can be researched (which is sometimes useful). It's also a good idea to add the 2015, 2016 and 2017 ARGE data (which we received in 2017) in the correct years, so that it's possible to go back to an 'end of this year' checkout and get an accurate view of what was found (for making plots and length stats). All of that requires some history rewriting, which is best done at the time of conversion.

Similarly expoweb is full of bloat from fat images and surveys and one 82MB thesis that got checked in and then removed. Clearing that out is a good idea. I have a set of 'unused fat blob' lists which can be stripped out with git-filter. It's not hard to make a 'do the conversion' script, ready for sometime after expo 2019 has calmed down.
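As a sketch of how a 'fat blob' list could be produced, the function below parses the output of `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'` and keeps the blobs above a size threshold. The 1 MB threshold, the function name and the sample lines are assumptions for illustration, not taken from the real expoweb repository.

```python
def fat_blobs(batch_check_output, min_bytes=1_000_000):
    """Return (size, path, object_id) for blobs of at least min_bytes, largest first."""
    found = []
    for line in batch_check_output.splitlines():
        # Fields: type, object id, size, then the path (which may contain spaces).
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        size = int(parts[2])
        if size >= min_bytes:
            found.append((size, path, parts[1]))
    return sorted(found, reverse=True)


# Made-up sample output; the IDs and paths are not real.
sample = """\
commit 1111111111111111111111111111111111111111 245
tree 2222222222222222222222222222222222222222 99
blob 3333333333333333333333333333333333333333 82000000 docs/big thesis.pdf
blob 4444444444444444444444444444444444444444 1200 index.htm
"""
for size, path, oid in fat_blobs(sample):
    print(size, path, oid)
```

Because the listing covers every object reachable from any ref, it also picks up files that were checked in and later removed, which is where most of the bloat would be hiding.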