Open almondtools opened 2 years ago
Thanks for the note. I'll have a look at your benchmarks, and keep them in mind.
Right now, I have a few things that I think need to be addressed before I push this to Maven and cut a 0.1 release.
@almondtools I was looking at the benchmarks--are there any scripts for handling the output?
I am not sure I understand ... I would suggest that you implement a triple:
- a benchmark class that extends MatcherBenchmark
- a class that implements Automaton, which is referenced in the benchmark (and which is a wrapper of your algorithm)
- a test class that extends MatcherBenchmarkTest
The tests search for a pattern in a sample and compare the number of results found with that of a reference implementation. They do not check whether each result is found at the correct location. I think the large test corpus (of the scaling benchmarks) makes it unlikely that a test passes by pure luck.
Does it help you?
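For readers following along, the count comparison described above might look roughly like the sketch below. It is not the code from the benchmark repository: the class and method names are hypothetical, java.util.regex stands in for the reference implementation, and a plain indexOf loop stands in for the wrapper around the library under test.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a count-only check; not the repository's test class.
public class MatchCountCheck {

    // Reference count: non-overlapping matches found by java.util.regex.
    static int referenceCount(String regex, String sample) {
        Matcher m = Pattern.compile(regex).matcher(sample);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Stand-in for the wrapper around the algorithm under test (assumption):
    // counts non-overlapping occurrences of a literal with String.indexOf.
    static int candidateCount(String literal, String sample) {
        if (literal.isEmpty()) {
            throw new IllegalArgumentException("literal must not be empty");
        }
        int count = 0;
        int from = 0;
        while (true) {
            int idx = sample.indexOf(literal, from);
            if (idx < 0) {
                break;
            }
            count++;
            from = idx + literal.length();
        }
        return count;
    }

    // Only the number of matches is compared, not their positions.
    static boolean countsAgree(String literal, String sample) {
        return referenceCount(Pattern.quote(literal), sample)
                == candidateCount(literal, sample);
    }

    public static void main(String[] args) {
        System.out.println(countsAgree("abc", "abcabcabxabc"));
    }
}
```

Because only the counts are compared, two implementations that report the same number of matches at different offsets would still agree, which is why a large corpus matters here.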
Sorry, my earlier question was a bit vague.
Yes, I was able to implement those in a branch I have locally, and doing so helped me find two bugs in needle.
However, when I run the tests, it seems to give mostly unstructured output to the console. Is there a good technique for turning that data into a table or other format that's good for analysis so I can easily compare my library to others? I didn't know if I missed something in your repo that does that, or if there's a nicer way than reading the results and extracting data by hand.
You have probably found the *bench*.cmd files. They write the benchmark data to CSV and text output (examples are attached). Unfortunately, I did not develop tools to analyze or visualize the benchmark results. I did this for stringbench, but it was a lot of effort and is probably not easy to reuse.
I also noticed that the benchmarks will have to be adjusted for other versions of Java/JMH. Hopefully you have solved this already.
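In the absence of ready-made analysis tools, one way to get a comparable table out of the CSV output is a small standalone reader like the sketch below. It assumes the CSV has a header row containing the columns Benchmark, Score, and Unit (as JMH's CSV result format does); whether the attached files match that layout is an assumption. The class name and the placeholder rows in main are invented for illustration and are not measured results.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns a benchmark CSV into an aligned text table.
public class BenchCsvTable {

    // Splits one CSV line, honoring double-quoted fields and "" escapes.
    // Fields with embedded newlines are not supported.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (quoted && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // escaped quote inside a quoted field
                    i++;
                } else {
                    quoted = !quoted;
                }
            } else if (c == ',' && !quoted) {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    // Builds a three-column table, locating columns by header name.
    static String toTable(List<String> csvLines) {
        List<String> header = splitCsvLine(csvLines.get(0));
        int name = header.indexOf("Benchmark");
        int score = header.indexOf("Score");
        int unit = header.indexOf("Unit");
        if (name < 0 || score < 0 || unit < 0) {
            throw new IllegalArgumentException("expected Benchmark, Score, Unit columns");
        }
        StringBuilder out = new StringBuilder();
        out.append(String.format("%-40s %14s  %s%n", "Benchmark", "Score", "Unit"));
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            List<String> f = splitCsvLine(line);
            out.append(String.format("%-40s %14s  %s%n",
                    f.get(name), f.get(score), f.get(unit)));
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder rows (made up), used only when no file is given.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(
                        "\"Benchmark\",\"Mode\",\"Score\",\"Unit\"",
                        "\"DemoBenchmark.candidate\",\"avgt\",\"1.0\",\"ms/op\"",
                        "\"DemoBenchmark.reference\",\"avgt\",\"2.0\",\"ms/op\"");
        System.out.print(toTable(lines));
    }
}
```

Passing the path of a CSV file as the first argument prints one row per benchmark. Looking columns up by header name rather than by position keeps the sketch usable if extra parameter columns are present.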
Whoops, my apologies. I overlooked the command files.
I have written a regex benchmark comparing different regex engines for Java. I recently came across your approach and would be curious to see how it performs compared to the alternatives: