DOMjudge / checktestdata

Checks the integrity of test data in programming contests like the ACM ICPC

Improve performance #3

Open meisterT opened 8 years ago

meisterT commented 8 years ago

For some checktestdata programs performance is terrible. Are there low hanging fruits that improve the general performance of checktestdata?

eldering commented 7 years ago

Some issues to look into:

eldering commented 7 years ago

These combined fixes improved checktestdata runtime on this testcase from ~8 min to ~2 sec.

austrin commented 7 years ago

An example where I gave up on using checktestdata and switched to python due to speed: problem I (Interception) from NCPC 2016. Problem package available here: http://ncpc.idi.ntnu.no/ncpc2016/ncpc2016-packages.tar.bz2 (there is an unused checktestdata script included).

There is lots of data so maybe there is not much to do, but one feature that would have been helpful here would be a version of the UNIQUE function that compares entries as unordered (multi)sets rather than ordered tuples. Then one could have skipped the annoying if-swap in the main loop, which takes a bit of the time (and pollutes the code). (That feature alone wouldn't have made the script fast enough, but it would at least have made it a bit less slow.)
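To make the suggestion concrete, here is a hypothetical sketch (in Python, since the actual feature would live in checktestdata's C++ core) of the semantics such an unordered-comparing UNIQUE could have. The function name `unique_unordered` is invented for illustration:

```python
def unique_unordered(rows):
    """Check that rows are unique when each row is compared as an
    unordered multiset (here: a sorted tuple) rather than an ordered
    tuple. Sorting each row replaces the manual if-swap in the script."""
    seen = set()
    for row in rows:
        key = tuple(sorted(row))  # normalization instead of the if-swap
        if key in seen:
            return False
        seen.add(key)
    return True

# (1, 2) and (2, 1) collide as unordered pairs:
assert unique_unordered([(1, 2), (3, 4)]) is True
assert unique_unordered([(1, 2), (2, 1)]) is False
```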

eldering commented 7 years ago

I've tested this on my computer. The check on test case 09.max_ans.in runs in 31 seconds; without the swap it takes 20 seconds. That's a decent bit less, but I'm not sure it warrants introducing a completely new command. I also analyzed the call graph with callgrind, and a quick glance did not show any clear targets for optimization. A fair amount of time (14%) is spent in GMP functions, another 10% in boost::variant, and 8% just in std::__lexicographical_compare_impl over GMP integer types.

austrin commented 7 years ago

That's a decent bit less, but I'm not sure it warrants introducing a completely new command.

Right, I agree that this speed improvement on its own is not sufficient reason to add such a command, but there may be other small reasons as well which together make it worth it. Another such reason could be that this type of constraint is somewhat common and the swap dance is a bit annoying and makes the scripts harder to read, so such a command could be nice to have regardless of performance issues. I don't really have a strong opinion on this, but at least wanted to bring up the possibility for you to consider.

austrin commented 6 years ago

@eldering having been frustrated by a few slow ctd scripts in the past weeks, I was wondering a bit more about potential speed improvements. One thought: would it be possible to add some sort of pragma-ish directive to force 64-bit integer arithmetic instead of arbitrary precision (since GMP takes a lot of the time)? I guess one would then want to do overflow checks every time one does an operation (and raise an error on overflow), but maybe it would still be faster, in particular for the common "read a million integers between 1 and 1000" type of format.
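A minimal Python sketch of the semantics such a directive would imply: every arithmetic operation stays in the int64 range and raises on overflow. (The real implementation would be in C++, e.g. via `__builtin_add_overflow`; `checked_add` is an invented name, and Python's arbitrary-precision ints let us compute first and range-check afterwards.)

```python
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def checked_add(a, b):
    """64-bit addition that raises on overflow, mimicking what a
    'force 64-bit arithmetic' mode would have to do on every operation."""
    r = a + b  # Python ints never overflow, so compute then range-check
    if not (INT64_MIN <= r <= INT64_MAX):
        raise OverflowError(f"{a} + {b} overflows int64")
    return r

assert checked_add(2**62, 2**62 - 1) == INT64_MAX
```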

eldering commented 6 years ago

[Funny, I was just looking into a few minor checktestdata issues, including this one.]

Yes, I think that would be a good avenue to explore, and would make sense if it improves performance significantly. I currently don't have the WF2018 problemset available. Is it (publicly) available somewhere? Otherwise, I'd have to wait a little before our backups are online.

austrin commented 6 years ago

I don't think so, but one set you could look at is NWERC 2017, in particular the problems ascendingphoto and factorfree. The validators look almost the same, except that one stores the values in an array (without using them for anything), and that one is about 3x slower than the other. But even the faster one is a bit sluggish -- on my machine it takes around 1.5s just to read a million numbers which with a lot of test files gets a bit annoying.
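For context, a minimal validator of the "read n integers in a range" kind, in Python since that is what was switched to when checktestdata was too slow. The one-integer-per-line format and the name `validate_ints` are assumptions for illustration, not the actual NWERC format:

```python
import io

def validate_ints(stream, n, lo, hi):
    """Check that stream contains exactly n integers in [lo, hi],
    one per line, with no trailing data."""
    for i in range(n):
        v = int(stream.readline())  # raises ValueError on malformed input
        if not (lo <= v <= hi):
            raise ValueError(f"value {v} out of range on line {i + 1}")
    if stream.readline() != "":
        raise ValueError("trailing data after last integer")

# A well-formed input with 100000 values in 1..1000 passes silently:
data = io.StringIO("".join(f"{(i % 1000) + 1}\n" for i in range(100000)))
validate_ints(data, 100000, 1, 1000)
```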

TPolzer commented 5 years ago

I have an experimental branch of checktestdata in antlr_rewrite that seems to perform a lot better; on the other hand, it is by far not as well tested, and it currently doesn't really have helpful error messages for non-conforming input.

thijskh commented 5 years ago

Well, the testsuite passes, so that's already quite something?

meisterT commented 5 years ago

What's the plan for @TPolzer's rewrite? Can we merge that?

TPolzer commented 5 years ago

The one area where it would be a clear regression is that it cannot generate inputs. Is that a feature people actually use?

TPolzer commented 5 years ago

It also lacks an option to be generous with white space, but that would probably be easy to implement.

eldering commented 5 years ago

Summary from a short discussion on #domjudge on the antlr rewrite: I'd prefer to keep using Make instead of Bazel since Make is available everywhere. If we can improve a few of the missing features (and aim for feature parity in the longer run), then I'm fine with replacing the current code with this rewrite.

meisterT commented 5 years ago

@TPolzer can you please describe how to build the current version of the antlr branch, i.e. which dependencies have to be installed?

TPolzer commented 5 years ago

Yes, .travis.yml isn't too bad a guide; the necessary build dependencies are

Then run `bazel test test` to run the testsuite, and `bazel build checktestdata` to build the main executable (which will end up at `bazel-bin/checktestdata`).

TPolzer commented 5 years ago

I'm not a big fan of make, and I actually think that having the build depend on Bazel and hosting prebuilt static binaries as GitHub releases might be a nicer experience for everybody involved.

eldering commented 5 years ago

Can you put your version in a separate directory? I don't like replacing the current version completely, for a number of reasons. First, the current version still has features that your code does not (yet) support, such as generating test data. Secondly, Bazel is in neither Debian nor Ubuntu, which makes make at this point still more of an off-the-shelf build system. I agree we can release static binaries, but that can be done either way, and I don't think that should be the only form of release; for example, being able to create Debian packages is also a nice feature, which doesn't work with Bazel at the moment.

meisterT commented 5 years ago

I was thinking about this issue a bit more. We have to separate the group of contributors to checktestdata from the group of users of checktestdata to come up with a qualified decision on how to proceed.

For users, the benefit of the rewrite is a much faster checktestdata than the old version - on average it's close to a 10x improvement. Users typically don't care how we achieve that speedup, the normal user also doesn't need to build the software from source. We haven't been updating the checktestdata debian package in a while and haven't been distributing static binary releases, but we probably should do both in any case. We should probably also look into distributing it via homebrew.

For contributors on the other hand, the build system matters more, but it's also a much smaller group of people. Overall, we had contributions from 8 people in total.

For Bazel there's a debian package which can be installed like this:

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel

So it's not hard to install (and keep it up to date).

btw. Bazel supports building debian packages itself: https://docs.bazel.build/versions/master/be/pkg.html#pkg_deb (disclaimer: I've never used the packaging rules)

Note that I also tried to add a Makefile in the new branch while making up my mind. However, the antlr version in Debian stable cannot generate C++ code: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=902798

Regarding the idea of putting the binary in a separate directory: I don't really like this because it increases the maintenance overhead. We have the original version, the haskell implementation, and now the antlr_rewrite. Any change we make, we would have to implement in every version, either slowing down development or letting the implementations diverge. We saw how that didn't work for the branches in the DOMjudge repository, so I would rather not repeat the same mistake here.

To summarize: users benefit greatly from the new version, while it gets a little bit harder for us as contributors. I think this is a fair tradeoff to make.

I propose to do the following:

nickygerritsen commented 5 years ago

I propose to maybe also look into making a Docker image with a checktestdata binary in it when we do static releases.

I agree with the rest of your ideas

austrin commented 5 years ago

We have to separate the group of contributors to checktestdata from the group of users of checktestdata to come up with a qualified decision on how to proceed.

Some further comments/questions on this. While not a contributor, from the perspective of problemtools (which lives downstream of checktestdata and maybe contributes a non-negligible fraction of checktestdata's users?) it would be desirable if checktestdata continues to be easily buildable using standard packages. But that 10x speedup would be so sweet to get.

If I understand correctly the bazel-vs-make-vs-whatever-floats-your-boat issue is in principle independent of the rewrite, and there is no inherent reason why it would have to use bazel?

Is the bazel part only needed to build the parser, or is it also needed for building the binary from the parsers (i.e., as on the current release branch)?

meisterT commented 5 years ago

Well, currently bazel is used for everything, and it's probably cumbersome to have two build systems for one project.

Just to rule out that possibility: would it work for you to distribute the released checktestdata binary instead? Or you could require installing checktestdata separately, like you require installing other packages (https://github.com/Kattis/problemtools#requirements-and-compatibility) and compilers.

eldering commented 5 years ago

I am still not convinced we need to use Bazel at all. I prefer make for the same reasons Per also mentioned, and I'm not a fan of installing software from sources outside of Debian.


meisterT commented 5 years ago

Jaap, what's your concrete suggestion then? Waiting until the next debian version is released?

eldering commented 5 years ago

I'm not sure I follow. I'm ok with depending on antlr4 from Debian Buster (since that's at least going to be in the upcoming release, and thus also in Ubuntu). But I'm proposing to keep/rewrite the build system in make instead of Bazel. Is there any reason this cannot be done?

meisterT commented 5 years ago

Ah ok, I was implicitly assuming that you insist on debian stable. If that's not the case, I'll try to cook something up that works on buster without bazel.

austrin commented 5 years ago

In case anyone else is interested in how much faster the antlr rewrite is, I made the following plot:

[plot: ctd_antlr_rewrite_speed]

Each dot is a checktestdata script for a problem. Its x coordinate is the sum of runtimes (in seconds) of the script on all test cases using checktestdata version 20180411, and its y coordinate is the sum of runtimes using the antlr rewrite. The dotted line is the y=x line. Note that the plot uses log-log axes. (The plot excludes all scripts where the two checktestdata versions currently produce different results, cf #7.)

There's really only one script where the antlr rewrite does significantly worse than the current checktestdata, and I've made it available here if you want to investigate further: https://gist.github.com/austrin/65576baa9aa8601e3b06da2cc8cad26d (yes, it's a funny one)

TPolzer commented 5 years ago

That tight cluster around 100 ms where there seems to be a small constant-factor regression looks interesting, too. I've also observed the somewhat larger regression on the very fast cases; I think it is mostly startup overhead of the antlr parser itself.

meisterT commented 5 years ago

Thanks Per. Do you mind sharing the raw data points?

austrin commented 5 years ago

@TPolzer re the cluster around 100 ms, I think it is also just the startup overhead you mention. Below is a similar plot for individual test cases (i.e., each dot is a run of a checktestdata script on a single test case). As you can see, the cases where the rewrite is slower are almost entirely runs in the sub-10 ms range (and many problems consist of around 20 or so such test cases, which then causes the cluster in the previous graph).

[plot: ctd-runtimes]

TPolzer commented 5 years ago

I've looked a bit into the gcc 6 issue, and it's not trivial, because I used "constexpr if" liberally (variant and optional can be taken from Boost, string_view from absl). Postponing that.

meisterT commented 4 years ago

If anyone wants to build this branch themselves, I have updated installation instructions (see https://github.com/DOMjudge/checktestdata/blob/antlr_rewrite/README.md).

I also added this branch to our gitlab CI which now uploads the resulting binary, so you don't have to build yourself: https://gitlab.com/DOMjudge/checktestdata/-/jobs

TPolzer commented 4 years ago

Thanks! Reading those instructions: which part of the test suite is expected to be non-deterministic, and why?


meisterT commented 4 years ago

I wondered the same yesterday but didn't have time to investigate.

Today I did some test cleanups (cherry-picked changes from master), disabled tests that do not work on the antlr branch, and ran with --runs_per_test=1000; all iterations passed.

Then I realized that the non-determinism comes from the part we don't support (yet?) in the antlr rewrite: the testdata generation.