skvadrik / re2c

Lexer generator for C, C++, D, Go, Haskell, Java, JS, OCaml, Python, Rust, V and Zig.
https://re2c.org

Allow -g, -b without -s, case ranges with gcc. #148

Open sirzooro opened 8 years ago

sirzooro commented 8 years ago

I did some benchmarks on my lexer (URL parsing) and got the following results for the different options used:

Generated code size (bytes):

test.cc      2,598,508
test_s.cc    2,071,914
test_sb.cc   2,068,836
test_sbg.cc  6,374,917

Compiled code size (bytes):

test.o         530,253
test_s.o       509,687
test_sb.o      527,949
test_sbg.o  11,031,536

Processor time in seconds (reported by the clock() function):

test       21.734
test_s     26.875
test_sb    23.828
test_sbg   15.516

Code was compiled using gcc 5.3.0 on Cygwin x86_64 with options -O2 -march=native, which translates to -O2 -march=sandybridge -maes -mavx -mcx16 -mfxsr -mmmx -mpclmul -mpopcnt -msahf -mxsaveopt. I did not use -g (debug info).

As you can see, code generated with the -s option cannot be optimized as well as code generated without it. -b improved things a bit, but the code is still worse than the baseline. After adding -g, however, the code sped up significantly. I suspect that if the code used computed gotos only, it would be even better. The -b -g combination is also worth a try; it may give an additional performance boost. Could you implement this?
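To make the comparison concrete, here is a hand-written sketch (not actual re2c output) of the two dispatch styles under discussion: a switch on the current byte, roughly what the code without -g uses, versus a computed-goto jump table, roughly what -g produces using the GCC labels-as-values extension. Both toy functions skip a run of decimal digits and return its length.

```c
#include <stddef.h>
#include <stdio.h>

/* switch-based dispatch: one switch on the current byte */
static size_t skip_digits_switch(const char *cur)
{
    const char *start = cur;
loop:
    switch (*cur) {
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        ++cur;
        goto loop;
    default:
        return (size_t)(cur - start);
    }
}

/* computed-goto dispatch: a jump table indexed by the current byte
 * (re2c emits the table as a static initializer; it is filled at run
 * time here only to keep the sketch short) */
static size_t skip_digits_goto(const char *cur)
{
    const char *start = cur;
    static void *target[256];
    for (int i = 0; i < 256; ++i)
        target[i] = &&done;
    for (int c = '0'; c <= '9'; ++c)
        target[c] = &&digit;
loop:
    goto *target[(unsigned char)*cur];
digit:
    ++cur;
    goto loop;
done:
    return (size_t)(cur - start);
}

int main(void)
{
    printf("%zu %zu\n", skip_digits_switch("12345abc"), skip_digits_goto("12345abc"));
    return 0;
}
```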

skvadrik commented 8 years ago

Oh yes, -s being a slowdown with modern compilers is a known fact (I mean I have observed it a couple of times and it has been discussed on the mailing lists). It used to speed up code some 20 years ago (and probably still does if you use tcc or some other non-optimising compiler).

-g without -s is worth a try.

sirzooro commented 8 years ago

The above results were for a URL of length 18. I did a few more benchmarks and got some interesting results:

URL of length 131, with many possible URL features used:

test       200.656
test_s     232.609
test_sb    191.000
test_sbg   244.813

This time -sb was the fastest and computed gotos the slowest.

URL of length 13; this URL was of a different type that was simpler to parse:

test       14.359
test_s     13.125
test_sb    12.359
test_sbg   11.532

Here every extra option added some speed boost.

So it looks like this heavily depends on the input, so in practice it should be measured with typical input data. Anyway, the ability to not use -s would be good.

One more thing came to my mind: gcc allows case ranges in switch statements (https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html). I did a small test (a switch with separate cases to recognize uppercase letters, lowercase letters, digits and everything else) with gcc 5.3.0. It generated the same code whether every case value was listed separately or case ranges were used. I did not test this with older gcc versions, but it is possible that they could use case ranges to generate better code. In any case, this feature reduces source code size, so parsing time during compilation will be shorter. What do you think about this?
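For reference, a hand-written sketch of the kind of test described above, using GCC's case-range extension; with separate case labels the same switch would need 62 lines:

```c
#include <stdio.h>

enum cls { UPPER, LOWER, DIGIT, OTHER };

/* classify a byte using GCC's case-range extension;
 * gcc 5.3.0 generated the same code as with individual case labels */
static enum cls classify(unsigned char c)
{
    switch (c) {
    case 'A' ... 'Z': return UPPER;
    case 'a' ... 'z': return LOWER;
    case '0' ... '9': return DIGIT;
    default:          return OTHER;
    }
}

int main(void)
{
    printf("%d %d %d %d\n", classify('Q'), classify('q'), classify('7'), classify('-'));
    return 0;
}
```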

skvadrik commented 8 years ago

Case ranges might be particularly useful with multibyte encodings like UTF-16 and UTF-32, where one is forced to use -s to keep the generated C/C++ code reasonably small.

skvadrik commented 8 years ago

All these options deserve a thorough investigation and measurement, so I probably won't get to it right now (not in the coming release).

shawnl commented 4 years ago

> One more thing came to my mind: gcc allows case ranges

Both gcc and clang support these, but with clang you will only get worse code from them, because clang lowers case ranges away before emitting LLVM IR, while gcc handles the ranges in the middle end.

skvadrik commented 4 years ago

Case ranges have been added recently; see this issue for discussion: https://github.com/skvadrik/re2c/issues/290. The option is --case-ranges and there is a corresponding re2c:flags:case-ranges configuration. As mentioned in the linked discussion, case ranges make a serious difference for tcc (3x smaller binary size, faster compile times).
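A minimal usage sketch (lexer.re is a hypothetical file name; the YYCTYPE/YYCURSOR definitions are the usual re2c interface, and setting re2c:flags:case-ranges = 1 is assumed to mirror the --case-ranges command-line flag):

```c
// generate with:  re2c --case-ranges lexer.re -o lexer.c
// or enable the flag inline via the configuration below
static int lex(const char *YYCURSOR)
{
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable  = 0;
        re2c:flags:case-ranges = 1;  // same effect as --case-ranges

        // with case ranges enabled, the generated switches should use
        // ranges like 'a' ... 'z' instead of one label per character
        [a-zA-Z]+ { return 1; }
        [0-9]+    { return 2; }
        *         { return 0; }
    */
}
```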