nodejs / http-parser

http request/response parser for c
MIT License
6.35k stars 1.54k forks source link

[Idea] Rewriting in LLVM DSL #411

Open indutny opened 6 years ago

indutny commented 6 years ago

I know it sounds crazy, but bear with me :wink:

What would you think about rewriting whole project using some JavaScript tool to compile a DSL to a LLVM IR?

I've several reasons for this:

It doesn't sound to complicated, and may finally make our codebase shine.

cc @bnoordhuis @mscdex

indutny commented 6 years ago

Here is an example of what it might look like: https://github.com/indutny/llparse/blob/master/test/fixtures/example.js

bnoordhuis commented 6 years ago

If you're thinking of writing some lexer generator, I could definitely get behind that - I've had the same thought more than once - but why LLVM IR specifically and wouldn't we be reinventing ragel?

indutny commented 6 years ago

@bnoordhuis there're probably better solutions out there, but I always wanted to write something that compiles to LLVM IR. Is it a valid excuse?

On a more serious note, I'd like every state implementation to live in a separate procedure that tail-calls other procedures in the most of the cases. This might turn out to be faster than having a single grand dispatch.

indutny commented 6 years ago

Also on the feature list is having a low-level trie implementation that works seamlessly.

indutny commented 6 years ago

It could probably work in a way similar to Ragel, however it'd have to rely on LLVM to inline the C code into compiled nodes of the state machine graph.

bnoordhuis commented 6 years ago

You can't replace http-parser with something that spits out LLVM IR, that leaves too many users out in the cold. Something that compiles to C or has different back-ends could work though (but then you're half-way on the road to reimplementing ragel or re2c.)

a low-level trie implementation

What for?

indutny commented 6 years ago

We obviously need a trie-like state machine to replace hand-written state code for the methods.

bnoordhuis commented 6 years ago

Not sure I follow. For lexers, you compute the DFA or NFA and then emit state tables or goto-based code. I guess you could implement it as a trie but I don't know why you would.

indutny commented 6 years ago

I didn't mean in-memory trie, as a data-structure. Rather a compiled code consisting of branches and states linked in a trie-like structure.

bnoordhuis commented 6 years ago

Ah, okay. You get that for free with a DFA but I guess a trie is pretty much a DFA encoded as a tree.

indutny commented 6 years ago

I think I'm on the edge of giving up. Just figured out that musttail may not work on arm.

indutny commented 6 years ago

Nvm, it works! 👍

indutny commented 6 years ago

Took few iterations, but I think I've reached some intermediate milestone here: https://github.com/indutny/llparse/blob/master/test/api-test.js

What do you think, @bnoordhuis ?

indutny commented 6 years ago

Here is what it produces: https://gist.github.com/indutny/ac9eedc036a43098b2ad0bf7b63e9f65

indutny commented 6 years ago

@bnoordhuis better example here: https://github.com/indutny/llparse/tree/master/examples/http

bnoordhuis commented 6 years ago

API-wise it looks real nice and the generated code looks tight apart from a hiccup:

$ make -C examples/http
node index.js > http.ll
cc -g3 -Os -flto -fvisibility=hidden -Wall http.ll main.c -o http
warning: overriding the module target triple with x86_64-apple-macosx10.13.0 [-Woverride-module]
1 warning generated.
cannot guarantee tail call due to mismatched parameter counts
  %15 = musttail call fastcc i8* @http_parser__invoke_on_complete(%http_parser_state* %0, i8* %14, i8* %2)
cannot guarantee tail call due to mismatched parameter counts
  %40 = musttail call fastcc i8* @http_parser__invoke_on_complete(%http_parser_state* %0, i8* %39, i8* %2)
cannot guarantee tail call due to mismatched parameter counts
  %10 = musttail call fastcc i8* @http_parser__method(%http_parser_state* %0, i8* %1, i8* %2, i32 0)
LLVM ERROR: Broken module found, compilation aborted!
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [http] Error 1

$ cc -v
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Do you have plans for a C back-end?

indutny commented 6 years ago

The hiccup is due to a bug in -flto, see: https://bugs.llvm.org/show_bug.cgi?id=36441 . I've just pushed a fix to a Makefile in that example.

Here is some real work on porting http parser: https://github.com/indutny/llhttp.


I don't really have plans for a C back-end yet, but will likely have to explore this eventually.

indutny commented 6 years ago

Note that despite not being present in the example, I've just introduced an API for creating callbacks using LLVM IR (instead of a reference to C function): https://github.com/indutny/llparse/blob/81578c8f41a926514a6625b2a9c9941218728408/test/fixtures/index.js#L85-L121

After I'll add an API to extend the state structure from the user code, these compiled code chunks will be able to update the state without ever calling the C-land. C-land calls are sort of expensive right now, as they can't be inlined (due to various machine-specific flags that has to be matched).

Additionally, I plan to introduce "mark"s that would work in the same way as they do in http_parser.c

indutny commented 6 years ago

@bnoordhuis and so we got spans (instead of "mark"s): https://github.com/indutny/llhttp/blob/38fb3aed8c7290aec389a109e2e8cd1c004872c4/lib/llhttp/url.js#L12 . They can be interleaved if we'd ever want to, and they should be efficient.

chadbrewbaker commented 6 years ago

"Getting rid of huge switch statement that isn't optimized well"

Can the switches be permuted to get speedup on average inputs? Perhaps a utility that takes as input a URI corpus file and outputs optimal permutations for the switches?

indutny commented 6 years ago

@chadbrewbaker there's little point in this, switches like the one in http_parser are optimized into jump table.