frankmcsherry / dynamic-datalog

Engines, queries, and data for dynamic Datalog computation
122 stars 8 forks source link

Why is DD much faster than Souffle on CRDT? #2

Open ryzhyk opened 5 years ago

ryzhyk commented 5 years ago

@frankmcsherry, do you happen to know how come DD is so much faster on this benchmark?

frankmcsherry commented 5 years ago

I think it is a few things, but not certain:

  1. Souffle can be very poor in certain joins, if not helped with planning hints.
  2. There is at least one pattern where an "argmax" is computed using quadratic tuples in Datalog, but differential uses a reduce.
  3. There is at least one place where the Datalog computes blank paths, and then restricts them to blank paths starting from non-blank points. A transformation could (in DD does) turn this in to an exploration that follows blank paths from non-blank starts (saves a quadratic factor).

There might be others, but the intent of this particular benchmark was idiom detection / reaction. Not that DD is magical at detection, but what we are seeing here are places where naive Datalog results in less efficient compute than something smarter that is either transformed Datalog, or not exactly Datalog.

frankmcsherry commented 5 years ago

Argmax: all of the laterSibling stuff, which just tries to determine which siblings can be cancelled because they can join with a sibling that is not themselves and is greater.

Blank skipping:

.decl skipBlank(From: id, To: id)
skipBlank(From, To) :- nextElem(From, To).
skipBlank(From, To) :- nextElem(From, Via), !hasValue(Via), skipBlank(Via, To).

.decl nextVisible(Prev: id, Next: id)
nextVisible(Prev, Next) :- hasValue(Prev), skipBlank(Prev, Next), hasValue(Next).

In this case, skipBlank could be developed starting from hasValue(Prev). As written above, if there is a path of blanks of length 1,000, there will be 1,000 choose 2 elements of skipBlank because any two points determine a blank path. But we only really care about 1,000 of them.

ryzhyk commented 5 years ago

Is your implementation available somewhere?

frankmcsherry commented 5 years ago

I can ship it at you. I haven't found a place to home it yet (didn't want to pollute the DD repo), but if we think of a fine place (I should probably put together a DD repo for these solutions, maybe independent of this repo, or maybe as a sub-crate/repo/wtf.

ryzhyk commented 5 years ago

Thanks! Your comment about planning hints is particularly interesting. I always thought about Souffle's ability to take planning hints as a major strength, which DD does not have right now. Isn't it the case that DD can also do quite poorly without user-defined schedules and that it could help to have something similar to Souffle's hints in the future?

frankmcsherry commented 5 years ago

My experience so far has been that Souffle can do exceptionally badly in some cases without hints. For example, in the polonius project (Rust borrow checker) there was a selfjoin roughly like

Reach(A,C) :- Reach(A,B), Reach(B,C)

whose implementation was "on change dReach to Reach, first index dReach and then iterate through Reach to find matches". This turns what would be "linear in added facts" to something more like "sum across iterations of facts so far". Tweaking the join order encourages it to maintain an index on Reach instead.

ryzhyk commented 5 years ago

For DDlog implementations of benchmarks, I think we will maintain them as tests in the DDlog repository. As far as I am concerned, there is nothing wrong with keeping DD code in this repo (dynamic-datalog) or anywhere else as long as it's reproduceable :)

ryzhyk commented 5 years ago

I used our Souffle converter (developed by @mbudiu-vmw) to conver the crdt example to DDlog and manually applied your skipBlank optimization to it. Here is the result I'm getting. Does it appear correct?

result{0,4,"!"}
result{0,4,"?"}
result{4,5,"y"}
result{5,6,"e"}
frankmcsherry commented 5 years ago

crdt.rs implementation provided with https://github.com/frankmcsherry/dynamic-datalog/commit/17a8b1f83451014ff4505d0685ed338808b54147

b-scholz commented 4 years ago

I had a quick look. This rule

skipBlank(From,To) :- nextElem(From,Via), !hasValue(Via), skipBlank(Via,To).

needs to be rewritten to this one

skipBlank(From,To) :- skipBlank(Via,To), nextElem(From,Via), !hasValue(Via).

if you want performance in Souffle.

frankmcsherry commented 4 years ago

Excellent, thanks! I'll try this out and see what the delta is.

To be honest, I'm not 100% sure if the goal of the repo is to max out each implementation, or to bake things off or what. My sort-of intent was to take "Datalog as presented" and map out some numbers, for example "differential with help" and perhaps as well "soufflé with help". But I'll certainly either leave a note or just change the Datalog.

b-scholz commented 4 years ago

Well - as we all know, current Datalog engines are not declarative when it comes to performance. Without providing appropriate hints, performance suffers severely. Hence, performance comparisons between different engines are very hard because they have different needs.

Souffle is sensitive towards rule ordering and requires these hints if you want performance.

It would be great to come up with an appropriate benchmarking methodology for logic engines. This would be great for our community.

frankmcsherry commented 4 years ago

It seems to have improved the times! I got a 10s build still, but a 294.73s second run. This is still a fair bit slower than differential, probably for the other reasons above (one could be fixed with source-to-source rewriting to move the non-blank restriction forward, but the "maximization" one is less clear to me that Datalog can easily express it).

I should stress, and I can change the language on the front page if it is painful, that soufflé is here just to give context for the "dynamic" engines. The original intent was that the other dynamic engine folks would contribute implementations, but other things seem to have come up. So it reads like a soufflé v differential bake off, but that wasn't the actual intent. >.<

I'll check the interpreted version to see if it comes in less than 1000s, and then update the repo with the changed query and updated numbers.

b-scholz commented 4 years ago

I had only a quick look at which rules can be improved immediately; at the moment we are trying to find appropriate benchmarks. Your benchmark suite is excellent!

We may optimize the codes further and forward you an optimized version in due course.

b-scholz commented 4 years ago

I made a pull-request substantially improving the runtime. See PR #3.