httprb / http

HTTP (The Gem! a.k.a. http.rb) - a fast Ruby HTTP client with a chainable API, streaming support, and timeouts
MIT License
3.01k stars 321 forks source link

Initial implementation for llparser binding #639

Closed midnight-wonderer closed 3 years ago

midnight-wonderer commented 3 years ago

Hello,
And ping https://github.com/httprb/http/issues/604, https://github.com/httprb/http/issues/630, https://github.com/httprb/http/issues/622.

I tried hooking the llparser in; it is actually not that hard.
I almost got every test passed; only one failed.

The test that failed is in client_spec.rb. The context is: "with broken body (too early closed connection)". The failure seems to be related to the error raised in the old parser.rb in add method.
I have no idea how this thing works, maybe a leaky abstraction?

On the implementation details I used https://github.com/metabahn/llhttp, which is not quite production-ready. The gem did not handle the compilation properly and required some patch to work with http.rb. The patch in question is minimal, just:

VALUE rb_llhttp_http_major(VALUE self) {
  llhttp_t *parser;

  Data_Get_Struct(self, llhttp_t, parser);

  return UINT2NUM(parser->http_major);
}

VALUE rb_llhttp_http_minor(VALUE self) {
  llhttp_t *parser;

  Data_Get_Struct(self, llhttp_t, parser);

  return UINT2NUM(parser->http_minor);
}

and

  rb_define_method(cParser, "http_major", rb_llhttp_http_major, 0);
  rb_define_method(cParser, "http_minor", rb_llhttp_http_minor, 0);

For testing, you have to compile the extension manually after patching.

In summary, replacing the gem should not take that long, and this PR can be a starting point.

midnight-wonderer commented 3 years ago

Digging further into this, I am not sure if the test is correct.
Here what the spec says:

If a Transfer-Encoding header field (section 14.41) is present and has any value other than "identity", then the transfer-length is defined by use of the "chunked" transfer-coding (section 3.6), unless the message is terminated by closing the connection.

So, if I were to interpret it verbatim, I would not raise an error there; even if the chunked transfer seems unfinished, the closing connection should precede it in this case.

Related commit: 72723df Ping: @Bonias

tarcieri commented 3 years ago

Related issue: https://github.com/httprb/http/pull/424

bryanp commented 3 years ago

@midnight-wonderer What compilation issues did you hit with llhtttp?

midnight-wonderer commented 3 years ago

The llhttp gem just shipped with .bundle to rubygems.org, no compilation on install. My workstation runs Linux, and it looks for .so, which is not there.

So an auto-deployed ruby project can't have the gem as the dependency yet.

bryanp commented 3 years ago

@midnight-wonderer It shouldn't include .bundle at all, that's a mistake. But it installs fine for me on ubuntu:

root@ubuntu-s-1vcpu-1gb-nyc3-01:~# apt-get -y update && apt-get -y install ruby ruby-dev build-essential
...
root@ubuntu-s-1vcpu-1gb-nyc3-01:~# gem install llhttp
Fetching llhttp-0.0.2.gem
Building native extensions. This could take a while...
Successfully installed llhttp-0.0.2
Parsing documentation for llhttp-0.0.2
Installing ri documentation for llhttp-0.0.2
Done installing documentation for llhttp after 0 seconds
1 gem installed
root@ubuntu-s-1vcpu-1gb-nyc3-01:~#

Please open an issue at https://github.com/metabahn/llhttp if you want to discuss the issues you're hitting.

tarcieri commented 3 years ago

Just to chime in here, I think moving to https://github.com/metabahn/llhttp sound great, but I think it'd be a lot more painless if the C extension shipped the source code as opposed to requiring a .so native dependency be installed on target systems or else installation fails.

Shipping the source code with the gem doesn't preclude dynamic linking for those who prefer it: it can still link with a dynamic library if detected. But failing to install at all without a dynamic library present is a bad user experience, IMO.

bryanp commented 3 years ago

@tarcieri I agree with you, but llhttp does ship the source code.

tarcieri commented 3 years ago

Aah, great! So it sounds like the installation issues are a bug?

bryanp commented 3 years ago

Yeah, it seems that way. I haven't been able to reproduce so hopefully @midnight-wonderer can provide more details.

tarcieri commented 3 years ago

Another thing to consider is JRuby.

Given https://github.com/metabahn/llhttp is written as an MRI C extension, we'd need to disable it on JRuby, but then the question is what to use instead. Perhaps we could include a pure Ruby alternative parser, or find a Java parser with a JRuby extension.

FFI would allow for using a common native extension on both platforms, which certainly simplifies things as there's only one parser to worry about.

bryanp commented 3 years ago

I'm certainly open to moving llhttp to FFI.

Btw it turns out I can reproduce @midnight-wonderer's issue. Requiring llhttp fails in the ubuntu example I posted above:

irb(main):001:0> require "llhttp"
Traceback (most recent call last):
       12: from /usr/bin/irb:23:in `<main>'
       11: from /usr/bin/irb:23:in `load'
       10: from /usr/lib/ruby/gems/2.7.0/gems/irb-1.2.1/exe/irb:11:in `<top (required)>'
        9: from (irb):1
        8: from /usr/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:156:in `require'
        7: from /usr/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:168:in `rescue in require'
        6: from /usr/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:168:in `require'
        5: from /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp.rb:3:in `<top (required)>'
        4: from /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp.rb:6:in `<module:LLHttp>'
        3: from /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp.rb:6:in `require_relative'
        2: from /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp/parser.rb:29:in `<top (required)>'
        1: from /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp/parser.rb:29:in `require_relative'
LoadError (cannot load such file -- /var/lib/gems/2.7.0/gems/llhttp-0.0.2/lib/llhttp/llhttp_ext)

I'll dig in later to see what's going on.

bryanp commented 3 years ago

I took a couple of hours to reimplement llhttp with FFI: https://github.com/metabahn/llhttp/pull/4. It seems to work, but it needs more testing and I'd like to do some benchmarking against the MRI C extension before committing to FFI. Open to your feedback!

ixti commented 3 years ago

:+1: for the move to llhttp, but as @tarcieri said we should ensure we support jRuby as well. Thus if llhttp will use ffi - that will be awesome!

bryanp commented 3 years ago

I just pushed a new release of llhttp along with a new llhttp-ffi gem for compatibility outside of MRI (see the changelogs for specific changes). The MRI implementation is ~4x more performant than the FFI implementation, so generally speaking llhttp-ffi should only be used to provide backwards-compatibility to a project.

Please let me know how I can help bring this across the finish line—happy to support the effort.

tarcieri commented 3 years ago

@bryanp perhaps you could make a new PR? (unless @midnight-wonderer wants to update this one)

bryanp commented 3 years ago

@tarcieri done: https://github.com/httprb/http/pull/651

ixti commented 3 years ago

651 was merged to master. Huge thanks to everyone involved!