nodejs / llparse

Generating parsers in LLVM IR
http://llparse.org
Other
584 stars 30 forks source link

Supplement LLVM's `switch` optimization #10

Closed indutny closed 6 years ago

indutny commented 6 years ago

There're lots of cases where we're matching against the range of the values, e.g. 0x20 - 0xff. LLVM does great job when there's nothing outside of this range, but when there're several ranges, or just few characters outside of the range - the code is optimized into a jump table, which leads to sub-optimal performance.

Thus we should either split the switch into several sub-switches, contribute a patch to LLVM for more sophisticated switch optimization, or generate more elaborate code ourselves.

indutny commented 6 years ago

Should we do bit testing as http-parser does?

indutny commented 6 years ago

This needs to be benchmarked: jump-table generated from large switch vs packed bit test table.

indutny commented 6 years ago

Results:

/tmp/1 "/just/a/normal/url/path"
bitmap[1]: 8192.00 mb | 1424.19 mb/s | 5.75 s
switch[1]: 8192.00 mb | 407.44 mb/s | 20.11 s

(See 80763b4)

indutny commented 6 years ago

Disregard the result above, disabling inlining yields:

$ /tmp/1 "/just/a/normal/url/path"
bitmap: 8192.00 mb | 1139.52 mb/s | 7.19 s
switch: 8192.00 mb | 409.14 mb/s | 20.02 s
indutny commented 6 years ago

Larger input:

$ /tmp/1 "/rather/long/and/boring/url/path/with/lots/of/characters/in/it/to/remove/call/time/from/benchmark"
bitmap: 8192.00 mb | 1206.14 mb/s | 6.79 s
switch: 8192.00 mb | 409.92 mb/s | 19.98 s
indutny commented 6 years ago

Improved url parsing speed in llhttp from 400 mb/s to 800 mb/s!