lodo1995 closed this issue 8 years ago
Sounds like a plan. As usual with (sort of) manual management, you have to make sure you do not escape references to freed or overwritten buffers. But before you write the `BufferedRangeLexer`, have your continuous benchmarking in place (unless you have done that already and I missed it), so you can track the difference.
Although not 100% complete, the benchmarking code is there. Just tweak some parameters at the beginning of `random_benchmark.d`, run `make clean-random-benchmark`, then `make random-benchmark > results.txt`. This way you can save the results to a file for comparison with future benchmarks. The results file also contains the configuration parameters, so that you know what you tweaked in `random_benchmark.d`.
The thing is, you have to do that by hand. And then you have to do the comparison by hand. I think it would be really helpful if you had graphs showing you all that. Have a look at https://github.com/dlang/phobos/pull/2995 and http://code.dlang.org/packages/std_benchmark . I think you already have all the data; you just need to dump it in a way that something like gnuplot can handle.
Understood. Code to produce a CSV should be almost a one-liner. I'll also make a script to feed gnuplot. This discussion belongs in issue #10 .
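As a rough sketch of what that CSV dump could look like: the `BenchmarkResult` struct and `dumpCsv` function below are illustrative assumptions, not the actual types in `random_benchmark.d`.

```d
import std.stdio : File;

// Hypothetical result record; the real benchmark data may differ.
struct BenchmarkResult
{
    string lexerName;   // e.g. "SliceLexer", "BufferedLexer"
    size_t inputBytes;  // size of the lexed input
    double msecs;       // measured time in milliseconds
}

// Write one CSV row per measurement, with a header line
// so gnuplot (or a spreadsheet) can label the columns.
void dumpCsv(BenchmarkResult[] results, string path)
{
    auto f = File(path, "w");
    f.writeln("lexer,input_bytes,msecs");
    foreach (r; results)
        f.writefln("%s,%s,%s", r.lexerName, r.inputBytes, r.msecs);
}
```

gnuplot can then read the file directly after `set datafile separator ","`, plotting one curve per lexer.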
Ping me if you want any extra help/advice on speeding up lexers.
@Hackerpilot thanks
@Hackerpilot thank you very much. I'll upload some work on this as soon as possible.
@Hackerpilot @burner I implemented `ForwardLexer` and `BufferedLexer`.
In terms of performance, `BufferedLexer` is asymptotically equal to `RangeLexer` for very small buffers and asymptotically equal to `SliceLexer` for very big ones (as expected). `ForwardLexer` didn't bring the expected performance gain with respect to `RangeLexer`, being only slightly faster.
Well, I would say that is a good result. If you made a mistake somewhere, you at least made it three times ;-)
When[1] you have some graphs, please share them.
[1] when the CSV export / gnuplot import is done
Here is the first meaningful graph, comparing the performance of the various lexers on files of increasing size. I'll upload the code to generate the graph soon.
For the time being, I'm quite happy with the lexers, so I'm closing this issue.
While the `SliceLexer` is quite fast, it requires the entire input to be loaded in memory beforehand. On the other hand, the current `RangeLexer` is painfully slow.
It may be useful to add a `ForwardLexer`, which requires its input to be at least a `ForwardRange`, using this information to speed up the reading process, in particular the allocation of memory.
It's also necessary to add a `BufferedLexer`, which takes as input an `InputRange` of slices. It will be very useful for buffered reads from files, having a speed comparable to the `SliceLexer` whenever the token is not on a buffer boundary, but not needing a huge amount of memory.
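To make the buffered idea concrete, here is a minimal sketch of a character source backed by an `InputRange` of slices. All names are illustrative, not the actual `BufferedLexer` API; it also assumes every chunk is non-empty. A lexer on top of this can slice tokens directly out of `buf` (as `SliceLexer` does) and only needs to copy when a token straddles two chunks.

```d
import std.range.primitives : isInputRange, ElementType, empty, front, popFront;

// A character input range fed by chunks (e.g. buffered file reads).
// Advancing within a chunk is a simple index increment; crossing a
// chunk boundary just swaps in the next slice, with no copying.
struct BufferedInput(R)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    private R chunks;            // remaining slices
    private const(char)[] buf;   // current slice
    private size_t pos;          // cursor inside buf

    this(R chunks)
    {
        this.chunks = chunks;
        if (!this.chunks.empty)
        {
            buf = this.chunks.front;
            this.chunks.popFront();
        }
    }

    bool empty() const { return pos >= buf.length; }

    char front() const { return buf[pos]; }

    void popFront()
    {
        ++pos;
        if (pos >= buf.length && !chunks.empty)
        {
            buf = chunks.front;  // move to the next slice
            chunks.popFront();
            pos = 0;
        }
    }
}
```

Because this satisfies the input-range interface, the rest of the lexer can stay range-agnostic while still getting near-`SliceLexer` speed inside a chunk.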