bozaro / git-lfs-migrate

Simple project for converting an old repository to use the git-lfs feature
MIT License
222 stars 29 forks

Not working with VSO #38

Open hcoona opened 7 years ago

hcoona commented 7 years ago

Hi there,

It seems that this tool cannot work with VSO hosted Git repos.

The error message is shown below:

[main] INFO git.lfs.migrate.Main -   processed: 4647/4647
Exception in thread "main" java.util.concurrent.ExecutionException: org.apache.http.client.ClientProtocolException
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at git.lfs.migrate.Main$HttpUploader.close(Main.java:359)
        at git.lfs.migrate.Main.processRepository(Main.java:170)
        at git.lfs.migrate.Main.main(Main.java:84)
Caused by: org.apache.http.client.ClientProtocolException
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at ru.bozaro.gitlfs.client.internal.HttpClientExecutor.executeMethod(HttpClientExecutor.java:26)
        at ru.bozaro.gitlfs.client.Client.doRequest(Client.java:275)
        at ru.bozaro.gitlfs.client.Client.putObject(Client.java:238)
        at ru.bozaro.gitlfs.client.BatchUploader.lambda$objectTask$13(BatchUploader.java:66)
        at ru.bozaro.gitlfs.client.internal.BatchWorker.processObject(BatchWorker.java:262)
        at ru.bozaro.gitlfs.client.internal.BatchWorker.lambda$submitTask$3(BatchWorker.java:223)
        at ru.bozaro.gitlfs.client.internal.BatchWorker.executeInPool(BatchWorker.java:298)
        at ru.bozaro.gitlfs.client.internal.BatchWorker.access$400(BatchWorker.java:34)
        at ru.bozaro.gitlfs.client.internal.BatchWorker$1.run(BatchWorker.java:317)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.http.ProtocolException: Transfer-encoding header already present
        at org.apache.http.protocol.RequestContent.process(RequestContent.java:93)
        at org.apache.http.protocol.ImmutableHttpProcessor.process(ImmutableHttpProcessor.java:132)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:182)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        ... 15 more

I tried ignoring this error, but then I could not check out from the converted repo. The error message is shown below:

$ git checkout -f
Downloading SMLPuller/lib/bond-3.0.jar (117.25 KB)
Error downloading object: SMLPuller/lib/bond-3.0.jar (da67920233505056889f412e32bb3b91af5a16dfc513a1e37e3264948828681c)

Errors logged to /tmp/test-target/.git/lfs/objects/logs/20170323T163817.1322287.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: SMLPuller/lib/bond-3.0.jar: smudge filter lfs failed

The content of /tmp/test-target/.git/lfs/objects/logs/20170323T163817.1322287.log is shown below:

cat /tmp/test-target/.git/lfs/objects/logs/20170323T163817.1322287.log
git-lfs/2.0.1 (GitHub; linux amd64; go 1.8)
git version 2.11.0

$ git-lfs filter-process
Error downloading object: SMLPuller/lib/bond-3.0.jar (da67920233505056889f412e32bb3b91af5a16dfc513a1e37e3264948828681c)

Smudge error: Error downloading SMLPuller/lib/bond-3.0.jar (da67920233505056889f412e32bb3b91af5a16dfc513a1e37e3264948828681c): batch response: Post /tmp/target.git/info/lfs/objects/batch: unsupported protocol scheme ""
github.com/git-lfs/git-lfs/errors.newWrappedError
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/errors/types.go:166
github.com/git-lfs/git-lfs/errors.NewSmudgeError
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/errors/types.go:252
github.com/git-lfs/git-lfs/lfs.PointerSmudge
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/lfs/pointer_smudge.go:68
github.com/git-lfs/git-lfs/lfs.(*Pointer).Smudge
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/lfs/pointer.go:64
github.com/git-lfs/git-lfs/commands.smudge
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/commands/command_smudge.go:63
github.com/git-lfs/git-lfs/commands.filterCommand
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/commands/command_filter_process.go:65
github.com/git-lfs/git-lfs/vendor/github.com/spf13/cobra.(*Command).execute
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/vendor/github.com/spf13/cobra/command.go:477
github.com/git-lfs/git-lfs/vendor/github.com/spf13/cobra.(*Command).Execute
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/vendor/github.com/spf13/cobra/command.go:551
github.com/git-lfs/git-lfs/commands.Run
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/commands/run.go:68
main.main
        /tmp/docker_run/src/github.com/git-lfs/git-lfs/amd64/obj-x86_64-linux-gnu/src/github.com/git-lfs/git-lfs/git-lfs.go:35
runtime.main
        /usr/local/go/src/runtime/proc.go:185
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:2197

ENV:
LocalWorkingDir=/tmp/test-target
LocalGitDir=/tmp/test-target/.git
LocalGitStorageDir=/tmp/test-target/.git
LocalMediaDir=/tmp/test-target/.git/lfs/objects
LocalReferenceDir=
TempDir=/tmp/test-target/.git/lfs/tmp
ConcurrentTransfers=3
TusTransfers=false
BasicTransfersOnly=false
SkipDownloadErrors=false
FetchRecentAlways=false
FetchRecentRefsDays=7
FetchRecentCommitsDays=0
FetchRecentRefsIncludeRemotes=true
PruneOffsetDays=3
PruneVerifyRemoteAlways=false
PruneRemoteName=origin
AccessDownload=none
AccessUpload=none
DownloadTransfers=basic
UploadTransfers=basic
GIT_DIR=.git
GIT_PREFIX=

bozaro commented 7 years ago

What is VSO?

hcoona commented 7 years ago

https://www.visualstudio.com/vso/

hcoona commented 7 years ago

I converted the repository with the following commands:

$ git clone --mirror https://shuaiz:$PAT@msasg.visualstudio.com/DefaultCollection/_git/Multi%20Tenancy source.git
$ java -jar /mnt/d/GitProjects/git-lfs-migrate/build/deploy/git-lfs-migrate.jar -s source.git -d target.git -l https://shuaiz:$PAT@msasg.visualstudio.com/DefaultCollection/_git/Multi%20Tenancy.git/info/lfs "*.exe" "*.dll" "*.pdb" "*.zip" "*.gzip" "*.jar"

hcoona commented 7 years ago

The repository is private, so I'm sorry that I cannot grant you access to it. However, I tried other repositories and they hit the same issue.

dampcake commented 7 years ago

I am also running into the same exception :(

zivillian commented 7 years ago

I had the same error with TFS2017 and wrote a small and dirty workaround.

The first problem is that TFS returns a Transfer-Encoding header, which causes the error above ("Transfer-encoding header already present"). The second problem was that the PUT requests were missing the Authorization header.

I haven't tested this against VSO, only against on-prem TFS2017.

The commit can be found at zivillian/git-lfs-migrate@a1a1d5e8bf67818461cb71f0a2de2d0740d1399c
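
For readers who want the idea without opening the commit, here is a minimal sketch of that kind of workaround. It is not the code from the commit above: `UploadHeaderFilter`, `sanitize`, and the placeholder credentials are hypothetical names, and it assumes the offending header arrives in the per-object header map that the server hands back for the upload. The sketch drops the framing headers that an HTTP client normally computes itself and adds an Authorization header when none is present.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of git-lfs-migrate or the git-lfs Java client. */
final class UploadHeaderFilter {
    // Framing headers that the HTTP client derives from the request body itself.
    private static final Set<String> CLIENT_MANAGED =
            new HashSet<>(Arrays.asList("transfer-encoding", "content-length"));

    /**
     * Returns a copy of the server-supplied headers without the client-managed
     * ones, adding the given Authorization value if the server sent none.
     */
    static Map<String, String> sanitize(Map<String, String> serverHeaders, String authorization) {
        Map<String, String> result = new LinkedHashMap<>();
        boolean hasAuthorization = false;
        for (Map.Entry<String, String> entry : serverHeaders.entrySet()) {
            // Header names are case-insensitive, so compare in lower case.
            String lowerName = entry.getKey().toLowerCase(Locale.ROOT);
            if (CLIENT_MANAGED.contains(lowerName)) {
                continue; // let the HTTP client set these on its own
            }
            if (lowerName.equals("authorization")) {
                hasAuthorization = true;
            }
            result.put(entry.getKey(), entry.getValue());
        }
        if (!hasAuthorization && authorization != null) {
            result.put("Authorization", authorization);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fromServer = new LinkedHashMap<>();
        fromServer.put("Transfer-Encoding", "chunked");
        fromServer.put("Accept", "application/vnd.git-lfs");
        // "Basic <credentials>" is a placeholder, not a real token.
        System.out.println(sanitize(fromServer, "Basic <credentials>"));
    }
}
```

Filtering before the headers are applied to the PUT request keeps the change local; whether the real client should do this by default is a separate question.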

deAtog commented 7 years ago

This issue is not limited to VSO; it is a problem in ru.bozaro.gitlfs.client.Client. At line 274 of its source file, in the doRequest method, addHeaders is called to add the appropriate headers to the request created on line 273. The subsequent call to http.executeMethod on line 275 fails because the "Transfer-encoding" header is being sent multiple times.

~~A possible solution to this issue may be to replace line 325 of ru.bozaro.gitlfs.client.Client with: req.setHeader(entry.getKey(), entry.getValue()); as this ensures that a header will be replaced if it exists, and added if it does not.~~

Upon further investigation, this issue is caused by multiple threads obtaining and using the same HTTP connection during the upload process. I'm still trying to determine the cause and a resolution, but the signs point towards the BatchUploader and the shared HttpClient instance.

hcoona commented 7 years ago

Great progress, thank you deAtog!

bozaro commented 7 years ago

I created a test repository on VSO and will fix this issue soon.

deAtog commented 7 years ago

I'm still not sure what the correct fix for this is. My understanding of how the HttpClient should work with the PoolingHttpClientConnectionManager is that a pool of persistent HTTP connections is created. Whenever the HttpClient executes a request, it takes an unused connection from the pool and returns it once the request has completed. By default, the size of the pool is limited to 2 connections per host. As I understand it, using the PoolingHttpClientConnectionManager should prevent two requests from using the same connection at the same time. This, however, does not appear to be the case.
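
To make that expectation concrete, here is a toy model of an exclusive-lease pool. It uses only the JDK and is not how HttpClient is implemented. `LeasePoolDemo`, `Conn`, and `maxUsersPerConnection` are names made up for this sketch. Each task takes a connection from a bounded queue, uses it, and puts it back, so a correctly behaving pool should never show more than one user per connection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of an exclusive-lease connection pool; not the HttpClient implementation. */
final class LeasePoolDemo {
    /** Stand-in for a persistent connection; counts its concurrent users. */
    static final class Conn {
        final AtomicInteger activeUsers = new AtomicInteger();
    }

    /**
     * Runs `requests` tasks on `threads` threads against `poolSize` connections and
     * returns the highest number of tasks that ever held the same connection at once.
     */
    static int maxUsersPerConnection(int poolSize, int threads, int requests) {
        BlockingQueue<Conn> idle = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            idle.add(new Conn());
        }
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            executor.submit(() -> {
                try {
                    Conn conn = idle.take(); // lease: blocks while every connection is in use
                    try {
                        int users = conn.activeUsers.incrementAndGet();
                        maxSeen.accumulateAndGet(users, Math::max);
                        Thread.sleep(1); // pretend to send a request
                    } finally {
                        conn.activeUsers.decrementAndGet();
                        idle.add(conn); // release the connection back to the pool
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        // 2 connections mirrors the per-host default mentioned above.
        System.out.println("max users per connection: " + maxUsersPerConnection(2, 8, 50));
    }
}
```

With exclusive leasing the reported maximum stays at 1 however many threads compete. The behavior seen in this issue would correspond to a value above 1, which this model cannot produce and does not explain.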

A simple workaround is to run git-lfs-migrate without the -g, --git, -l, or --lfs options, i.e. without specifying the Git repository or LFS server. git-lfs-migrate then stores the LFS objects in the destination repository, and they can subsequently be pushed with git lfs push, following the directions here: https://help.github.com/articles/duplicating-a-repository/#mirroring-a-repository-that-contains-git-large-file-storage-objects
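
The two steps of this workaround can be sketched as a small Java driver. This is an illustration under assumptions, not a tested recipe: `OfflineMigrateSketch` and its methods are hypothetical, the jar path, repository names, and remote URL are placeholders, and it presumes `java`, `git`, and `git lfs` are on the PATH. The only options used are `-s` and `-d` from the command earlier in this thread and `git lfs push --all` from the linked directions.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical driver for the offline workaround; all paths and URLs are placeholders. */
final class OfflineMigrateSketch {
    /** Step 1: convert locally. No -l/--lfs or -g/--git, so nothing is uploaded. */
    static List<String> migrateCommand(String jar, String source, String target, String... globs) {
        List<String> command =
                new ArrayList<>(Arrays.asList("java", "-jar", jar, "-s", source, "-d", target));
        command.addAll(Arrays.asList(globs));
        return command;
    }

    /** Step 2: let the regular git-lfs client upload the objects. */
    static List<String> lfsPushCommand(String remoteUrl) {
        return Arrays.asList("git", "lfs", "push", "--all", remoteUrl);
    }

    /** Runs a command in the given directory and fails on a non-zero exit code. */
    static void run(File workingDir, List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).directory(workingDir).inheritIO().start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command failed with exit code " + exitCode + ": " + command);
        }
    }

    public static void main(String[] args) {
        // Printed rather than executed; pass each list to run(...) to actually execute it,
        // using the converted repository as the working directory for the push.
        System.out.println(migrateCommand("git-lfs-migrate.jar", "source.git", "target.git", "*.jar"));
        System.out.println(lfsPushCommand("https://example.invalid/new-repository.git"));
    }
}
```

The linked directions also push the Git refs themselves (`git push --mirror`), so the LFS push is only one part of moving the converted repository.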