AndreyNikiforov / vault2git

Converts vault repository to git
52 stars 51 forks source link

improved error handling #10

Open ormico opened 11 years ago

ormico commented 11 years ago

check for no labels pause between vault operations retry vault operations on error

robe070 commented 11 years ago

I have done some work on trying to make it run faster. I have come to the conclusion that the Vault API does not have a 'get changeset' command. It only gets a Version. When it gets a Version it copies the current state of every file into the WorkingFolder. As your repository grows it gets slower and slower. I started with 0.7 secs and by Version 4600 it was > 13 secs.

So why this change has decided it impacts the server baffles me. In my case it definitely does not. It only goes there to get the changed files. The rest comes from the client-side cache.

ormico commented 11 years ago

On my server, and this may be different for other people, on large repositories I'll see vault return errors a handful of times. If I run it in debug in visual studio I've found that I can just move the execution pointer back and rerun the same command and it always succeeds. When I add this tiny wait in between commands I don't get any errors.

robe070 commented 11 years ago

I think you should be running Vault2Git on a separate machine to your Vault Server. 99% of its activity is with the client-side cache. It hardly goes to the server at all - just to get the changes between versions. The rest is copying cache to the working directory which is horrifically slow - my 22,000 files take about 22 seconds right now.

I rarely get exceptions, and when I do your retry code does the trick. No need for the delay between every get.

ormico commented 11 years ago

I never run vault2git on the server.

ormico commented 11 years ago

It doesn't matter to me. I only have 1 repo left to convert and it is low priority. I do get errors frequently enough for it to be a hassle which is why I made these changes. With the two changes I submitted, can leave a large repo converting and go do something else and be assured that when I come back hours later it will have finished successfully and not have errored or hung. It might work just as well if you left the Try/Catch logic in and took out the 0.2 sec wait after each GET but personally I would rather it take slightly longer and be assured it would work and not have to re-do it.

But that is just me. I'm almost done with this task and just wanted to submit this in case it was helpful to someone else.

AndreyNikiforov commented 11 years ago

As far as I remember, Vault API are not backward compatible. E.g. v6 will not connect v4 server. This was the reason I wrote against server I was using (v4) and did not use latest version of DLLs available at that time. This fact also prevented vault2git being distributed as binary utility (in addition to source code): everybody has to get source code, update vault dlls to the version they use, recompile, and prey the API did not change...

I see benefit of keeping source code at the [original] version tested and, possibly, introduce separate branches for other versions of vault dlls.

robe070 commented 11 years ago

@ormico I'm commenting on the Pull Request being put in the main repo.

You have a comment that the code you introduced was to reduce impact on the Server. But as far as I can see thats not the case, unless you run it on the server.

I completely agree with your code change to retry after an exception. But given the rarity of the exception I do not think it wise to delay 0.2 seconds between every Version. For 21,000 Versions thats an extra 1 hour and 15 minutes. Catching the exception is all that is required. Also, for small repositories 0.2 seconds will double the time to do the conversion.

I have included the catching of the exception in my Pull Request, but not the 0.2s delay and not the label improvement which I have not looked at.

robe070 commented 11 years ago

@AndreyNikiforov Vault API are not compatible any which way. You must match client to server.

So I agree with keeping separate branches for other versions of vault DLLs.

That implies that all Pull Requests to Master should not include the DLLs.

I see benefit of keeping source code at the [original] version tested

I presume that comment is just in relation to the DLLs, not the rest of the code?

atiq-cs commented 6 years ago

Is it not approved yet because of inclusion of dll files in the commit?