sirzooro opened 8 years ago
Oh yes, `-s` being a slowdown with modern compilers is a known fact (I mean, I did observe it a couple of times and it has been discussed on the mailing lists). It used to speed up code some 20 years ago (and probably still does if you use tcc or some other non-optimising compiler). `-g` without `-s` is worth a try.
The above results were for a URL of length 18. I did a few more benchmarks and got some interesting results.

URL of length 131, exercising many possible URL features:
test 200.656
test_s 232.609
test_sb 191.000
test_sbg 244.813
This time `-sb` was the fastest, and computed gotos the slowest.
URL of length 13; this URL was of a different type that was simpler to parse:
test 14.359
test_s 13.125
test_sb 12.359
test_sbg 11.532
Here every extra option added some speed boost.
So it looks like this heavily depends on the input, so in practice it should be measured against typical input data. Anyway, the ability to not use `-s` would be good.
One more thing came to my mind: gcc allows case ranges in switches (https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html). I did a small test (a switch with separate cases to recognize upper-case chars, lower-case chars, digits, and everything else) with gcc 5.3.0. It generated the same code whether every case value was listed separately or case ranges were used. I did not test this with older gcc versions, but it is possible that they may use case ranges to generate better code. Of course this feature also reduces source code size, so parsing time during compilation will be shorter. What do you think about this?
Case ranges might be particularly useful with multibyte encodings like UTF-16 and UTF-32, where one is forced to use `-s` to keep the generated C/C++ code reasonably small.
All these options deserve a thorough investigation and measurement, so I probably won't get to it right now (not in the coming release).
> One more thing came to my mind - gcc allows to use case ranges
Both gcc and clang support these, but with clang you are only going to get worse code from this, because case ranges are lowered away before the code reaches LLVM, while gcc handles ranges in the middle end.
Case ranges have been added recently; see this bug for discussion: https://github.com/skvadrik/re2c/issues/290. The option is `--case-ranges`, and there is a corresponding `re2c:flags:case-ranges` configuration. As mentioned in the linked discussion, case ranges make a serious difference for Tcc (3x smaller binary size, faster compile times).
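For illustration, a sketch of where the configuration would go in a re2c input file (the rule set here is a made-up placeholder, not taken from this issue); the same effect can be had by passing `--case-ranges` on the command line:

```c
/* Hypothetical re2c input fragment enabling case ranges per block. */
/*!re2c
    re2c:flags:case-ranges = 1;
    re2c:define:YYCTYPE = unsigned char;
    re2c:yyfill:enable = 0;

    [a-z]+ { return IDENT; }   // placeholder rule
    *      { return ERROR; }   // placeholder rule
*/
```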
I did some benchmarks on my lexer (URL parsing) and got the following results for the different options used:

Generated code size:

Compiled code size:

Processor time in secs (reported by the `clock()` function):

Code was compiled using gcc 5.3.0 on Cygwin x86_64 with options `-O2 -march=native`, which translates to `-O2 -march=sandybridge -maes -mavx -mcx16 -mfxsr -mmmx -mpclmul -mpopcnt -msahf -mxsaveopt`. I did not use `-g`.

As you can see, code generated with the `-s` option cannot be optimized as well as code generated without this option. `-b` improved things a bit, but the code is still worse than the baseline. But after adding `-g` the code sped up significantly. I suspect that if the code used computed gotos only it would be even better. The `-b -g` combination is also worth a try; it may give an additional performance boost. Could you implement this?