NavicoOS / ac2git

Tool to convert an AccuRev repository to Git.

[Problem with script a.k.a. probably I did something wrong] Git state is broken #99

Closed tommix1987 closed 7 years ago

tommix1987 commented 7 years ago

Intro

Let me start from the beginning: I'm absolutely amazed by the amount of good work that has gone into the script! Good job, guys! Anyway, let's get to the point:

Issue 1

I'm running the script on a reasonably big repo with a long history (1.5+ years). While the script is resolving one of the streams, I get the error message below:

Missmatch while retrieving stream XXX_XXX (id: streamId), the state ref (refs/ac2git/depots/1476/streams/83/info) is on tr. 29858 but the data ref (refs/ac2git/depots/1476/streams/83/data) wasn't retrieved.

I believe that the state ref is wrong (it should be on a transaction around 15k, not 29k, which is basically the last transaction).

Probably important: this happened after a random failure in the script's commit step, which was caused by a nested git repo (a .git folder) inside the AccuRev repo. I've managed to fix that by now.

Is there any way I can tell it to rewind itself to the proper transaction?

Issue 2

The second issue is that on another stream in the depot I get the following error:

XXX_XXX: next transaction 23192 (end tr. 29867)
Running time was  0:05:42.02
The script has encountered an exception, aborting!
Traceback (most recent call last):
  File "ac2git.py", line 3968, in AccuRev2GitMain
    rv = state.Start(isRestart=args.restart, isSoftRestart=args.softRestart)
  File "ac2git.py", line 3205, in Start
    self.RetrieveStreams()
  File "ac2git.py", line 1615, in RetrieveStreams
    tr, commitHash = self.RetrieveStream(depot=depot, stream=streamInfo, dataRef=dataRef, stateRef=stateRef, hwmRef=hwmRef, startTransaction=self.config.accurev.startTransaction, endTransaction=endTr.id)
  File "ac2git.py", line 1560, in RetrieveStream
    stateTr, stateHash = self.RetrieveStreamInfo(depot=depot, stream=stream, stateRef=stateRef, startTransaction=startTransaction, endTransaction=endTransaction)
  File "ac2git.py", line 1279, in RetrieveStreamInfo
    self.SafeCheckout(ref=stateRef, doReset=True, doClean=True)
  File "ac2git.py", line 656, in SafeCheckout
    logger.debug( "Reset current branch - '{br}'".format(br=status.branch) )
AttributeError: 'NoneType' object has no attribute 'branch'
Traceback (most recent call last):
  File "ac2git.py", line 3988, in <module>
    AccuRev2GitMain(sys.argv)
  File "ac2git.py", line 3968, in AccuRev2GitMain
    rv = state.Start(isRestart=args.restart, isSoftRestart=args.softRestart)
  File "ac2git.py", line 3205, in Start
    self.RetrieveStreams()
  File "ac2git.py", line 1615, in RetrieveStreams
    tr, commitHash = self.RetrieveStream(depot=depot, stream=streamInfo, dataRef=dataRef, stateRef=stateRef, hwmRef=hwmRef, startTransaction=self.config.accurev.startTransaction, endTransaction=endTr.id)
  File "ac2git.py", line 1560, in RetrieveStream
    stateTr, stateHash = self.RetrieveStreamInfo(depot=depot, stream=stream, stateRef=stateRef, startTransaction=startTransaction, endTransaction=endTransaction)
  File "ac2git.py", line 1279, in RetrieveStreamInfo
    self.SafeCheckout(ref=stateRef, doReset=True, doClean=True)
  File "ac2git.py", line 656, in SafeCheckout
    logger.debug( "Reset current branch - '{br}'".format(br=status.branch) )
AttributeError: 'NoneType' object has no attribute 'branch'

Any ideas?

fatfreddie commented 7 years ago

I'll try and find some time this evening to take a look. Hopefully @orao will get a chance to comment, too.

tommix1987 commented 7 years ago

Thanks in advance. I'm desperate to get out of AccuRev as soon as I can, so any help will be very much appreciated.

orao commented 7 years ago

Issue 1

The code that prints that error can be found here. It is inside the RetrieveStream() function, which simply invokes two sequential operations: first the retrieval of the AccuRev metadata (the RetrieveStreamInfo() call), then the retrieval of the AccuRev stream contents (the RetrieveStreamData() call). The line that prints the error checks that both of these operations completed successfully. It seems that the retrieval of the stream contents is failing for a reason that isn't clear from the information you've provided. It would help if you could put some debug print statements in the RetrieveStreamData() function and tell us why and where it is returning None. Even if you're unsure, letting us know the rough code path that the script is taking would help a lot.
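The two-step flow described above can be sketched roughly as follows. This is a simplified illustration, not the actual ac2git code: the function names, return values and the failing data step are all hypothetical stand-ins chosen to reproduce the shape of the reported error.

```python
from typing import Callable, Optional, Tuple

Result = Tuple[Optional[int], Optional[str]]  # (transaction number, commit hash)


def fake_retrieve_info(stream: str) -> Result:
    """Hypothetical stand-in for RetrieveStreamInfo(): the metadata step succeeds."""
    return 29858, "info-commit-hash"


def fake_retrieve_data(stream: str) -> Result:
    """Hypothetical stand-in for RetrieveStreamData(): the contents step fails."""
    return None, None


def retrieve_stream(stream: str,
                    info_fn: Callable[[str], Result] = fake_retrieve_info,
                    data_fn: Callable[[str], Result] = fake_retrieve_data) -> Result:
    # Step 1: metadata (state ref). Step 2: stream contents (data ref).
    state_tr, _state_hash = info_fn(stream)
    data_tr, data_hash = data_fn(stream)

    # Both steps must have completed; otherwise report the mismatch and give up.
    if state_tr is not None and data_tr is None:
        print("Mismatch while retrieving stream {0}: the state ref is on tr. {1} "
              "but the data ref wasn't retrieved.".format(stream, state_tr))
        return None, None
    return data_tr, data_hash
```

With the default stand-ins, `retrieve_stream("XXX_XXX")` prints the mismatch message and returns `(None, None)`, which is why the useful question is where the data step bails out.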

Only transactions with a non-empty diff for your specific stream, as returned by accurev diff -a -i -v stream_name -V stream_name, are recorded. The last transaction returned by accurev hist is not necessarily the last transaction that changed the contents of your stream, so the script will continue to retrieve all parent stream transactions; see the deep-hist explanation.

To check what transactions the script thinks changed your stream use:

python3 accurev.py deep-hist -p <my-depot> -s <my-stream> -t 1-highest

However, if you still think that the info ref is incorrect you can inspect it with:

git log refs/ac2git/depots/1476/streams/83/info

There are 3 files of interest that are saved in the tree that the info ref tracks:

  1. hist.xml
  2. streams.xml, and
  3. diff.xml (not present for mkstream transactions)

To view their contents you can use:

git show refs/ac2git/depots/1476/streams/83/info:hist.xml
git show refs/ac2git/depots/1476/streams/83/info:streams.xml
git show refs/ac2git/depots/1476/streams/83/info:diff.xml
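If you want to dump all three files in one go, a small helper along these lines might do. It is only a sketch: the helper names are made up for illustration, it assumes git is on the PATH and that it is run against the converted repository, and it does nothing beyond calling the same git show commands listed above.

```python
import subprocess

INFO_FILES = ("hist.xml", "streams.xml", "diff.xml")


def info_ref(depot_id, stream_id):
    """Build the info ref name in the layout shown in this thread."""
    return "refs/ac2git/depots/{0}/streams/{1}/info".format(depot_id, stream_id)


def show_commands(depot_id, stream_id):
    """Return one `git show <ref>:<file>` command per file of interest."""
    ref = info_ref(depot_id, stream_id)
    return [["git", "show", "{0}:{1}".format(ref, name)] for name in INFO_FILES]


def dump_info_files(depot_id, stream_id, repo_dir="."):
    """Run each command and print its output (hypothetical convenience wrapper)."""
    for cmd in show_commands(depot_id, stream_id):
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, text=True)
        print("$ " + " ".join(cmd))
        # diff.xml is absent for mkstream transactions, so a failure here is expected.
        print(result.stdout if result.returncode == 0 else result.stderr)
```

For example, `dump_info_files(1476, 83, repo_dir="/path/to/converted/repo")` would print the three files for the stream in question; the path is a placeholder.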

It is unfortunate that you had a .git repository in there. We identified this as a potential problem early in the development but never went back to fix it (relates to issue #1). The solution we considered was to use separate .git and work directories. If you have a solution for this problem a pull request would be welcome.

Note 1: You've done a good job redacting sensitive information, but the stream ID is encoded in the ref name, alongside the depot ID. Use refs/ac2git/depots/<depot-id>/streams/<stream-id>/info if you're concerned about stream numbers, though personally I think numerical IDs should be fine since they don't reveal anything useful.

Note 2: Some of the transaction numbers that the script prints for a stream are a "high water mark". This is not the case here but for future reference the "high water mark" is the last transaction up to which we have checked our stream for changes, not the last transaction that changed the stream itself.
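The distinction in Note 2 can be illustrated with a tiny sketch. The function and the transaction numbers are hypothetical and only mirror the wording above; this is not how the script stores its state.

```python
from typing import Dict, List, Optional


def summarize(checked_up_to: int, changing_transactions: List[int]) -> Dict[str, Optional[int]]:
    """Contrast the high water mark with the last transaction that changed the stream."""
    seen = [tr for tr in changing_transactions if tr <= checked_up_to]
    return {
        # Last transaction up to which the stream has been checked for changes.
        "high_water_mark": checked_up_to,
        # Last transaction that actually changed the stream (None if there was none).
        "last_change": max(seen) if seen else None,
    }


# Hypothetical stream: checked up to tr. 29858, but last changed at tr. 15012.
summary = summarize(29858, [120, 9400, 15012])
print(summary)
```

So a stream can legitimately report a number near the end of the depot history even though its contents last changed much earlier.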

orao commented 7 years ago

Issue 2

The error you have is on this line. This means that the git status command failed for some reason. If possible would you mind adding some debug output in this code (see _docmd())?

Adding the following:

output = self._docmd(cmd)
print("cmd: {0}".format(cmd))
print("stdout: {0}".format(self.lastStdout))
print("stderr: {0}".format(self.lastStderr))
print("returncode: {0}".format(self.lastReturnCode))

on this line, is one example.

fatfreddie commented 7 years ago

Great answers @orao, I couldn't have done as well.

ghost commented 7 years ago

Thanks @fatfreddie!

@tommix1987 if you need any further clarification or if you have more questions we'd be glad to help.

tommix1987 commented 7 years ago

Hello gentlemen, thanks for the very quick response. I will try to organize the details as soon as I can. I've currently switched the order of the streams in the config file, so the script is now busy sorting out other streams that need to be migrated and are not causing any issues (I had a random network failure in the meantime, which unfortunately delayed everything). I guess I'll have to wait another day or two before giving you all the details needed. Anyway, thank you again for the very quick response. I almost feel like I'm on a paid support plan here, since you guys respond almost immediately.

ghost commented 7 years ago

I have just noticed that the error message format string is wrong. The streamId string should be {streamId} and be substituted with the actual stream id. Fixed in commit 148fdda.
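For anyone curious, the class of bug is easy to reproduce with plain str.format(). The template strings below are shortened, hypothetical versions of the message, not copies of the script's source.

```python
# Without braces, "streamId" is literal text and str.format() leaves it alone.
buggy_template = "retrieving stream {name} (id: streamId)"
# With braces, it becomes a replacement field.
fixed_template = "retrieving stream {name} (id: {streamId})"

# str.format() silently ignores unused keyword arguments, so the buggy
# template raises no error; it just prints the wrong text.
buggy = buggy_template.format(name="XXX_XXX", streamId=83)
fixed = fixed_template.format(name="XXX_XXX", streamId=83)

print(buggy)
print(fixed)
```

This is why the original message showed "(id: streamId)" rather than the real stream id.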

ghost commented 7 years ago

Looking into Issue 1 a little more carefully, and seeing where RetrieveStreamData() could return None -- assuming that the provided error message was the only error printed, there are a couple of possibilities to consider:

Non-verbose output:

If you didn't run the script with the -v option then none of the logger.debug() calls would print, so there are only two possible return paths from RetrieveStreamData() that could be taken here with only the provided error message printed. These return paths are taken when the script fails to make a commit on the new ref, line #1440 and/or line 1444.

All other lines have at least one logger.info() or logger.error() call meaning that it should print something on stdout or stderr respectively, indicating an issue or a stage of processing before printing the error.

I should possibly change line #1440 and line 1444 to print the messages using logger.error() instead of logger.debug().
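The effect can be seen with the standard logging module alone. The sketch below assumes that a non-verbose run corresponds to the INFO level and that -v corresponds to DEBUG; that mapping is an assumption for illustration, and the logger name and messages are made up.

```python
import io
import logging


def run_with_level(level):
    """Log one debug and one error message at the given level; return what was emitted."""
    stream = io.StringIO()
    logger = logging.getLogger("ac2git_sketch_{0}".format(level))
    logger.setLevel(level)
    logger.propagate = False  # keep the output confined to our handler
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.debug("commit failed (debug)")
    logger.error("commit failed (error)")
    logger.removeHandler(handler)
    return stream.getvalue()


quiet = run_with_level(logging.INFO)     # assumed equivalent of running without -v
verbose = run_with_level(logging.DEBUG)  # assumed equivalent of running with -v
print(repr(quiet))
print(repr(verbose))
```

In the quiet run only the error-level message appears, which is the reason for promoting those two messages from debug to error.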

ghost commented 7 years ago

Changed logger.debug() use to logger.error() for the two possible locations that RetrieveStreamData() could return None in commit b732045.

ghost commented 7 years ago

@tommix1987 please update your script by pulling the latest before investigating issue 1 further. It should help a little.

ghost commented 7 years ago

Issue 2 has been partially addressed in commit 80310869, by making the script print a more useful error message. It still doesn't resolve the issue but should help in finding the root cause.
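The general idea of guarding against a failed git status can be sketched like this. It is not the content of the commit mentioned above: the GitStatus type and the describe_reset() helper are hypothetical, and only the debug message text is taken from the traceback.

```python
from typing import NamedTuple, Optional


class GitStatus(NamedTuple):
    """Hypothetical minimal stand-in for the object a status call returns."""
    branch: str


def describe_reset(status: Optional[GitStatus]) -> str:
    # A failed `git status` is modelled here as None; check before touching .branch.
    if status is None:
        return "git status failed; cannot determine the current branch"
    return "Reset current branch - '{br}'".format(br=status.branch)


print(describe_reset(None))
print(describe_reset(GitStatus(branch="master")))
```

Checking for None first turns the AttributeError into a message that points at the failing command instead.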

I suspect that Issue 2 is likely related to your accurev stream/depot having had a .git repository promoted into it.

ghost commented 7 years ago

Adding the enhancement label since I've added a few more debug options/improvements as a follow-up to the lack of information.

orao commented 7 years ago

@tommix1987 this issue seems to have gone stale, have you made any progress on the issue?

orao commented 7 years ago

This issue has not been confirmed and has gone stale. Closing due to inactivity.

@tommix1987 feel free to reopen if you manage to gather more information.