JuliaWeb / HttpParser.jl

Deprecated! Julia wrapper for joyent/http-parser
MIT License

InexactError or OutOfMemory Error for on_body callback #32

Closed binarybana closed 5 years ago

binarybana commented 9 years ago

Using Julia 0.4 nightly with LLVM 3.5 (nalimilan's Fedora nightlies) and LLVM 3.6.0 (compiled from source) on Fedora 21 x86_64. I get some test failures stemming from the on_body callback.

When instrumenting on_body as:

function on_body(parser, at, len)
    @show at, len
    @show typeof(at), typeof(len)
    r.data = string(r.data, bytestring(convert(Ptr{Uint8}, at)), int(len))
    return 0
end

I get

julia test/runtests.jl
[ 0.4 dep warnings removed here ]
(at,len) = (Ptr{Int8} @0x00007f2852c1153e,0xffffffffffffffff)
(typeof(at),typeof(len)) = (Ptr{Int8},UInt64)
ERROR: LoadError: InexactError()
 in on_body at /root/.julia/v0.4/HttpParser/test/runtests.jl:95
while loading /root/.julia/v0.4/HttpParser/test/runtests.jl, in expression starting on line 131
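The InexactError fits the value printed for len: 0xffffffffffffffff is the largest UInt64, so a checked conversion to a signed integer cannot represent it. A minimal sketch in current Julia syntax (where Int(x) replaces the deprecated int(x)); it is an illustration only, not code from the package or its tests:

```julia
# The value printed for `len` in the log above.
len = 0xffffffffffffffff

# It is the largest UInt64, so it does not fit in a signed 64-bit Int.
@assert len == typemax(UInt64)

# A checked conversion throws InexactError rather than wrapping around.
threw = try
    Int(len)
    false
catch err
    err isa InexactError
end
@assert threw

# Reinterpreting the same bits as a signed integer gives -1.
@assert reinterpret(Int64, len) == -1
```

The same bit pattern read as a signed value is -1, which suggests the callback may be receiving something other than a real body length.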

It looks like there's some type of corruption occurring, but only with on_body. When using get from Requests.jl, I found that it almost seemed like the at and len arguments were swapped, and swapping them as:

function on_body(parser, at, len)
    oldat = at
    at = convert(Ptr{Uint8}, len)
    len = convert(Csize_t, oldat)
    r = pd(parser).current_response
    @show at, len
    @show typeof(at), typeof(len)
    r.data = string(r.data, bytestring(convert(Ptr{Uint8}, at), len))
    return 0
end

seemed to make all the callbacks (namely headers) appear to work, i.e.:

julia> y = get("http://httpbin.org/get")
WARNING: [a,b,...] concatenation is deprecated; use [a;b;...] instead
 in depwarn at ./deprecated.jl:42
 in oldstyle_vcat_warning at ./abstractarray.jl:28
 in vect at ./abstractarray.jl:37
 in render at /root/.julia/v0.4/Requests/src/Requests.jl:23
 in open_stream at /root/.julia/v0.4/Requests/src/Requests.jl:232
 in open_stream at /root/.julia/v0.4/Requests/src/Requests.jl:215
 in do_request at /root/.julia/v0.4/Requests/src/Requests.jl:577
 in get at /root/.julia/v0.4/Requests/src/Requests.jl:591
 in get at /root/.julia/v0.4/Requests/src/Requests.jl:590
(at,len) = (Ptr{UInt8} @0x00007fd6cfdfc8f0,0x000000000000000a)
(typeof(at),typeof(len)) = (Ptr{UInt8},UInt64)
Response(200 OK, 10 Headers, 10 Bytes in Body)

julia> y.headers
Dict{AbstractString,AbstractString} with 10 entries:
  "current_header"   => ""
  "true"             => ""
  "http_major"       => "1"
  "Content-Type"     => "text/html; charset=utf-8"
  "Date"             => "Wed, 13 May 2015 16:08:57 GMT"
  "Content-Language" => "en"
  "http_minor"       => "1"
  "Keep-Alive"       => "1"
  "Server"           => "Julia/0.4.0-dev+4817"
  "status_code"      => "200"

julia> y.data
"P\x01\U9057f\0\0_\n"

Though the data still seems to be wrong.

Any thoughts?

IainNZ commented 9 years ago

Has this ever come back?

binarybana commented 9 years ago

On a freshly updated version from nalimilan's nightly repo, I'm now getting:

julia> using Requests
Warning: could not import Base.put into Requests
get("http://httpbin.org/get")
julia> get("http://httpbin.org/get")
WARNING: [a,b,...] concatenation is deprecated; use [a;b;...] instead
 in vect at abstractarray.jl:38
 in render at /home/jason/.julia/v0.4/Requests/src/Requests.jl:23
 in open_stream at /home/jason/.julia/v0.4/Requests/src/Requests.jl:224
 in open_stream at /home/jason/.julia/v0.4/Requests/src/Requests.jl:207
 in do_request at /home/jason/.julia/v0.4/Requests/src/Requests.jl:569
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:583
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:582
while loading no file, in expression starting on line 0
WARNING: int(x) is deprecated, use Int(x) instead.
 in on_header_value at /home/jason/.julia/v0.4/Requests/src/Requests.jl:122
 in http_parser_execute at /home/jason/.julia/v0.4/HttpParser/src/HttpParser.jl:106
 in process_response at /home/jason/.julia/v0.4/Requests/src/Requests.jl:234
 in do_request at /home/jason/.julia/v0.4/Requests/src/Requests.jl:569
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:583
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:582
while loading no file, in expression starting on line 0
ERROR: ArgumentError: cannot convert NULL to string
 in on_header_value at /home/jason/.julia/v0.4/Requests/src/Requests.jl:122
 in http_parser_execute at /home/jason/.julia/v0.4/HttpParser/src/HttpParser.jl:106
 in process_response at /home/jason/.julia/v0.4/Requests/src/Requests.jl:234
 in do_request at /home/jason/.julia/v0.4/Requests/src/Requests.jl:569
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:583
 in get at /home/jason/.julia/v0.4/Requests/src/Requests.jl:582

Referring to this code:

function http_parser_execute(parser::Parser, settings::ParserSettings, request)
    ccall((:http_parser_execute, lib), Csize_t, 
            (Ptr{Parser}, Ptr{ParserSettings}, Ptr{Uint8}, Csize_t,), 
            &parser, &settings, convert(Ptr{Uint8}, pointer(request)), sizeof(request))
    if errno(parser) != 0
        throw(HttpParserError(errno(parser)))
    end 
end

So, another error, but with a similar flavor?

IainNZ commented 9 years ago

Looks like the error is being thrown in Requests though?

wildart commented 9 years ago

Yes, it was in Requests. I fixed it.

binarybana commented 9 years ago

Thanks @wildart, that indeed fixes that error. Unfortunately, that brings me back to the original error:

julia> using Requests
Warning: could not import Base.put into Requests

julia> get("http://httpbin.org/get")
at = Ptr{Int8} @0x00007f39b99f77d0
len = 0x00007f39b9fd7340
ERROR: OutOfMemoryError()
 in on_body at /home/jason/.julia/v0.4/Requests/src/Requests.jl:149

Where I've instrumented at and len as follows:

function on_body(parser, at, len)
    r = pd(parser).current_response
    @show at
    @show len
    r.data = string(r.data, bytestring(convert(Ptr{Uint8}, at), len))
    return 0
end

Unfortunately, I don't know where to begin debugging this, so to try and pull my weight, I made a reproducible environment. For those of you running Docker, here is a Dockerfile which reproduces the bug:

FROM fedora:21
RUN yum -y update
RUN yum -y install dnf
RUN /usr/bin/dnf install -y dnf-plugins-core && /usr/bin/dnf copr enable -y nalimilan/julia-nightlies && /usr/bin/dnf install -y julia
RUN dnf install -y nettle gnutls
RUN julia -e '[Pkg.add(x) for x in split("Requests")]'
WORKDIR /root/.julia/v0.4/Requests
RUN git checkout art/httpparser-32
ENTRYPOINT ["/usr/bin/julia"]
CMD ["-e","using Requests; get(\"http://httpbin.org/get\")"]

Put that in an empty directory as Dockerfile then:

$ docker build -t binarybana/julia-bug:latest .
$ docker run --rm -it binarybana/julia-bug:latest
Warning: could not import Base.put into Requests
ERROR: OutOfMemoryError()
 in on_body at /root/.julia/v0.4/Requests/src/Requests.jl:147

And debug with:

$ docker run --rm -it --entrypoint=/bin/bash binarybana/julia-bug:latest

Edit: this has to be my proudest bug reporting moment ever: Look at that minimum reproducible example! whistles

wildart commented 9 years ago

@binarybana Something is wrong with the len parameter. It defines the size of the body, but it's too big (0x00007f39b9fd7340). Your len value looks more like an address. What version of libhttp_parser do you have?
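For illustration, the reported len (0x00007f39b9fd7340) is well over 100 terabytes and shares its high bits with the at pointer printed next to it (0x00007f39b99f77d0), which is consistent with it being an address. A rough sketch of a sanity check that a callback could run before allocating; MAX_CHUNK and plausible_len are hypothetical names and the bound is an arbitrary assumption, not anything defined by HttpParser.jl or Requests.jl:

```julia
# Hypothetical upper bound on a single body chunk (64 MiB); an assumption for this sketch only.
const MAX_CHUNK = 0x0000000004000000

# Return true when `len` could plausibly be a chunk length.
plausible_len(len::Unsigned) = len <= MAX_CHUNK

@assert plausible_len(0x000000000000000a)    # 10 bytes, as printed in the swapped-argument run
@assert !plausible_len(0x00007f39b9fd7340)   # pointer-like value from the OutOfMemoryError run
@assert !plausible_len(0xffffffffffffffff)   # value from the InexactError run
```

A check like this would not fix the underlying problem, but it would turn an OutOfMemoryError into an earlier, more descriptive failure.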

binarybana commented 9 years ago

$ rpm -qa | grep http
....
http-parser-2.0-7.20121128gitcd01361.fc21.x86_64

wildart commented 9 years ago

OK, libhttp_parser 2.1.0 does not have this error, and neither does 2.5.0. Try following the steps in build.jl to build a recent version of libhttp_parser from source, then run the Pkg.build command to fix the path to the newly built library (check the deps.jl file afterwards).

binarybana commented 9 years ago

I bisected libhttp-parser to find the commit that fixes this bug: https://github.com/joyent/http-parser/commit/0938fe599f7e3e4405880216ea445d634a974375

Not sure how to interpret those code changes, but thanks for the workaround!

binarybana commented 9 years ago

I imagine I should file a bug upstream with Fedora to get a newer version packaged then...

wildart commented 9 years ago

Sure. However, I found another OutOfMemoryError in 2.5.0, while I debugged this one.

binarybana commented 9 years ago

Ahh, I got it. https://github.com/JuliaWeb/HttpParser.jl/blob/master/src/HttpParser.jl#L72 only exists in the struct for newer versions of http-parser. Otherwise the size of the struct is incorrect.

If you remove that line, and then the corresponding callback in Requests when it constructs the ParserSettings, then it works with http-parser 2.0.
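To illustrate why an extra field matters, here is a small sketch in current Julia syntax. The two struct types and their field names are hypothetical stand-ins, not the real http-parser or HttpParser.jl definitions, and the sketch assumes the extra field sits in the middle of the struct rather than at the end:

```julia
# Hypothetical layout that an older C library might expect.
struct OlderSettings
    on_first::Ptr{Cvoid}
    on_second::Ptr{Cvoid}
    on_third::Ptr{Cvoid}
end

# Hypothetical layout after one callback slot is inserted in the middle.
struct NewerSettings
    on_first::Ptr{Cvoid}
    on_inserted::Ptr{Cvoid}   # exists only in the newer layout
    on_second::Ptr{Cvoid}
    on_third::Ptr{Cvoid}
end

# Byte offset of a named field within a struct type.
offset_of(T, name) = Int(fieldoffset(T, findfirst(==(name), fieldnames(T))))

ptr = sizeof(Ptr{Cvoid})

# The newer struct is one pointer larger, and every later field moves down one slot.
@assert sizeof(NewerSettings) == sizeof(OlderSettings) + ptr
@assert offset_of(OlderSettings, :on_second) == ptr
@assert offset_of(NewerSettings, :on_second) == 2ptr

# A library built for the older layout reads `on_inserted` where it expects `on_second`.
@assert offset_of(NewerSettings, :on_inserted) == offset_of(OlderSettings, :on_second)
```

Under that assumption, passing the newer layout to a library compiled against the older one means each callback after the inserted slot is invoked in place of its neighbour, with arguments meant for a different signature.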

wildart commented 9 years ago

I guess we need to indicate which versions of libhttp_parser this package works with.
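One possible shape for such a check, sketched in current Julia syntax: parse the version out of the RPM package name quoted earlier in this thread and compare it with a minimum. MIN_VERSION and rpm_http_parser_version are hypothetical names, and the 2.1.0 minimum is only an assumption based on the comment that 2.1.0 does not show the error:

```julia
# Assumed minimum, taken from the observation that 2.1.0 does not show the error.
const MIN_VERSION = v"2.1.0"

# Extract the library version from an RPM package name.
# Returns `nothing` when the name does not match the expected pattern.
function rpm_http_parser_version(pkg::AbstractString)
    m = match(r"^http-parser-(\d+(?:\.\d+)*)-", pkg)
    m === nothing ? nothing : VersionNumber(m.captures[1])
end

installed = rpm_http_parser_version("http-parser-2.0-7.20121128gitcd01361.fc21.x86_64")

@assert installed == v"2.0.0"
@assert installed < MIN_VERSION   # the Fedora 21 package would be flagged as too old
```

This only covers systems where the library comes from an RPM; how the package would actually detect the installed version is left open here.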