redrabbit / git.limo

A Git source code management tool powered by Elixir with easy installation & high extensibility.
https://git.limo
MIT License

Error when pushing large repos over HTTP #39

Closed by codeadict 5 years ago

codeadict commented 5 years ago

Elixir Version: Elixir 1.8.2 (compiled with Erlang/OTP 21)
OS: OSX Mojave 10.14.4

Description:

I tried pushing the Linux kernel (https://github.com/torvalds/linux) and the Phoenix framework over HTTP. In both cases the git client fails with the error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500. On the GitGud side I'm getting the following traceback:

[error] #PID<0.689.0> running GitGud.Web.Endpoint (connection #PID<0.683.0>, stream id 3) terminated
Server: localhost:4000 (http)
Request: POST /codeadict/phoenix/git-receive-pack
** (exit) an exception was raised:
    ** (ArgumentError) argument error
        :erlang.bit_size([])
        (gitrekt) lib/gitrekt/packfile.ex:36: GitRekt.Packfile.parse/2
        (gitrekt) lib/gitrekt/wire_protocol/receive_pack.ex:126: GitRekt.WireProtocol.ReceivePack.next/2
        (gitrekt) lib/gitrekt/wire_protocol.ex:134: GitRekt.WireProtocol.exec_next/3
        (gitrekt) lib/gitrekt/wire_protocol.ex:142: GitRekt.WireProtocol.exec_all/3
        (gitrekt) lib/gitrekt/wire_protocol.ex:129: GitRekt.WireProtocol.exec_run/3
        (gitgud) lib/gitgud/smart_http_backend.ex:167: GitGud.SmartHTTPBackend.git_exec/3
        (gitgud) lib/gitgud/smart_http_backend.ex:154: GitGud.SmartHTTPBackend.git_pack/3
        (gitgud) lib/gitgud/smart_http_backend.ex:58: anonymous fn/2 in GitGud.SmartHTTPBackend.do_match/4
        (gitgud) lib/plug/router.ex:259: GitGud.SmartHTTPBackend.dispatch/2
        (gitgud) lib/gitgud/smart_http_backend.ex:1: GitGud.SmartHTTPBackend.plug_builder_call/2
        (phoenix) lib/phoenix/router/route.ex:39: Phoenix.Router.Route.call/2
        (phoenix) lib/phoenix/router.ex:275: Phoenix.Router.__call__/1
        (gitgud_web) lib/gitgud_web/endpoint.ex:1: GitGud.Web.Endpoint.plug_builder_call/2
        (gitgud_web) lib/plug/debugger.ex:122: GitGud.Web.Endpoint."call (overridable 3)"/2
        (gitgud_web) lib/gitgud_web/endpoint.ex:1: GitGud.Web.Endpoint.call/2

redrabbit commented 5 years ago

Hi @codeadict and thank you for reporting this issue.

I tried to push the Phoenix repository via HTTP and got the same error. After some debugging, I can see that the error happens when inflating incoming zlib data. The actual error I got is:

{:zlib_data_error, "invalid distance too far back"}

which seems to occur when zlib tries to inflate (decompress) corrupt data.
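To illustrate how a misaligned slice alone is enough to make zlib reject otherwise valid data, here is a minimal Python sketch using the standard `zlib` module. It is not GitRekt code; `try_inflate` and the sample payload are hypothetical, and the exact error message depends on where the corruption lands, so it will not necessarily be "invalid distance too far back".

```python
import zlib

# Hypothetical payload standing in for a PACK object body.
payload = b"git object payload " * 1000
compressed = zlib.compress(payload)


def try_inflate(data: bytes) -> str:
    """Return "ok" if data inflates cleanly, else the zlib error text."""
    try:
        zlib.decompress(data)
        return "ok"
    except zlib.error as exc:
        return f"error: {exc}"


# The untouched stream inflates fine.
intact = try_inflate(compressed)

# Simulate a parser whose offset is off by one byte in either direction:
# a stray byte before the stream, or the first byte of the stream skipped.
stray_byte = try_inflate(b"\x00" + compressed)
skipped_byte = try_inflate(compressed[1:])

print(intact)
print(stray_byte)
print(skipped_byte)
```

The data itself is identical in all three calls; only the starting offset differs, which is the kind of failure an off-by-one in the packfile parser would produce.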

Now there are a few things that could lead to this problem:

1) The GitRekt.WireProtocol.ReceivePack implementation misbehaves when using HTTP.
2) The GitRekt.Packfile implementation is wrong somewhere and the data passed to zlib is invalid.
3) The GitRekt.Git.object_zlib_inflate/1 NIF implementation is wrong when inflating big chunks.

Pushing the same repository via SSH works for me. It would be great if you could give that a try as well.

The main difference between the two protocols is that SSH starts unpacking data as soon as it receives anything from the client while HTTP needs to receive the entire PACK first.

Because the chunks received via SSH do not always align with the objects in the PACK file, GitRekt.Packfile provides a way to break the parsing process into multiple steps when the client has not sent enough data yet (see GitRekt.Packfile.parse/2).
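The general idea of resumable parsing can be sketched as follows. This is a Python illustration over a made-up length-prefixed record format, not the real PACK format and not the actual GitRekt.Packfile.parse/2 logic; `parse`, `feed_chunks` and `encode` are hypothetical names.

```python
import struct


def parse(buffer: bytes):
    """Parse as many complete records as possible.

    Each record is a 4-byte big-endian length followed by that many bytes.
    Returns (records, leftover), where leftover is the unparsed tail that
    must be kept until more data arrives.
    """
    records = []
    offset = 0
    while True:
        if len(buffer) - offset < 4:
            break  # length prefix not fully received yet
        (size,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < size:
            break  # record body not fully received yet
        start = offset + 4
        records.append(buffer[start:start + size])
        offset = start + size
    return records, buffer[offset:]


def feed_chunks(chunks):
    """Feed arbitrarily sized chunks, carrying the leftover between calls."""
    pending = b""
    parsed = []
    for chunk in chunks:
        records, pending = parse(pending + chunk)
        parsed.extend(records)
    return parsed, pending


def encode(payloads):
    """Build a stream in the toy format used by parse/1 above."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


payloads = [b"commit", b"tree data", b"blob " * 50]
stream = encode(payloads)

# Split the stream into 7-byte chunks that do not align with record bounds.
chunks = [stream[i:i + 7] for i in range(0, len(stream), 7)]
parsed, leftover = feed_chunks(chunks)
print(len(parsed), leftover)
```

The important property is that the parser never consumes a partial record: anything it cannot finish is handed back and prepended to the next chunk.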

From my debug logs, I can see that the failing PACK object should inflate to 120628 bytes of zlib-decompressed data. Currently, the Git.object_zlib_inflate/1 implementation has a chunk size limit of 16384 bytes (16K), so it should return 8 chunks (which is nothing unusual)...
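For reference, the chunk count works out as ceil(120628 / 16384) = 8: seven full 16384-byte chunks (114688 bytes) plus a final chunk of 5940 bytes. The Python sketch below checks that arithmetic and shows one way to inflate with a bounded output size per step using the standard `zlib` module. It only approximates what the NIF does; `inflate_in_chunks` and the sample payload are hypothetical.

```python
import math
import zlib

CHUNK_SIZE = 16384        # chunk size limit mentioned above
EXPECTED_SIZE = 120628    # inflated size of the failing object

chunk_count = math.ceil(EXPECTED_SIZE / CHUNK_SIZE)


def inflate_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Inflate a zlib stream, producing at most chunk_size bytes per step."""
    inflater = zlib.decompressobj()
    chunks = []
    pending = data
    while not inflater.eof:
        chunk = inflater.decompress(pending, chunk_size)
        # Input that was not consumed because the output limit was reached.
        pending = inflater.unconsumed_tail
        if chunk:
            chunks.append(chunk)
        elif not inflater.eof:
            # No output and no end of stream: the input ended too early.
            raise ValueError("truncated zlib stream")
    return chunks


# Hypothetical payload with the same inflated size as the failing object.
payload = (bytes(range(256)) * 472)[:EXPECTED_SIZE]
chunks = inflate_in_chunks(zlib.compress(payload))

print(chunk_count, len(chunks), len(chunks[-1]))
```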

codeadict commented 5 years ago

Thanks for your quick response. I've seen both the error I posted and the one you mentioned. At this point I'm not sure which of the three things is the cause, but it would be nice to find out. I don't have much time at the moment, but I will try to find some for debugging.

redrabbit commented 5 years ago

I think 77ecba8 fixes this issue.

I just closed #55 with commit 16008cd, which streams the HTTP request instead of reading it all at once. I tried to push and clone repositories of various sizes, including elixir-lang/elixir (15.900) and phoenix-framework/phoenix (6.095), and did not encounter any errors.

I did not manage to push very large repositories such as golang/go (41.033). With HTTP I got timeout errors, while the SSH implementation never returns:

$ git push local2
Enumerating objects: 378973, done.
Counting objects: 100% (378973/378973), done.
Delta compression using up to 8 threads
Compressing objects: 100% (74455/74455), done.
Writing objects: 100% (378973/378973), 187.02 MiB | 995.00 KiB/s, done.
Total 378973 (delta 301502), reused 378968 (delta 301499)

codeadict commented 5 years ago

Awesome, I'm testing it now.