Open helloqirun opened 3 years ago
@helloqirun , thanks for reporting the issue. That the timestamp is small suggests that ddlog spends this entire time instantiating the enormous dataflow graph with thousands of rules and relations. In addition, 0.21s is a lot of time to perform a trivial computation, which likely means that differential dataflow also struggles with this huge graph at runtime.
I see that @Kixiron self-assigned this issue, so he may have a more detailed explanation soon.
Just curious: how did you come up with this program? It looks like auto-generated code :)
Thanks @ryzhyk and @Kixiron ! Yes, the .dl file is auto-generated. It essentially contains a Dyck language of 1600+ parentheses, defined by the grammar "S-> SS | (_i S )_i | empty". I was using souffle and switched to ddlog because it supports incremental updates.
I am not familiar with dyck, but here is a equivalent program that adds an idx
column to A
and B
to model 1600 individual relations used in your code. This should be way more efficient.
typedef node = u32
input relation RA(idx: u16, s: node, t: node)
input relation RB(idx: u16, s: node, t: node)
input relation EMPTY(s: node, t: node)
output relation S(s: node, t: node)
S(x, y) :- EMPTY(x, y).
S(x, y) :- S(x,z), S(z,y).
S(x, z) :- RA(i, x, a), S(a, b), RB(i, b, z).
This is a really interesting case that I'm not fully done investigating so I'll update whenever I finish looking into it. From what I can see now, this looks like a really interesting and strangely pathological case due to arrangements, there's hundreds of thousands of arrangement-related events produced by the code but very few timely events which is super interesting. Again, not finished looking into things so I'll update y'all when I have more info
I am not familiar with dyck, but here is a equivalent program that adds an
idx
column toA
andB
to model 1600 individual relations used in your code. This should be way more efficient.typedef node = u32 input relation RA(idx: u16, s: node, t: node) input relation RB(idx: u16, s: node, t: node) input relation EMPTY(s: node, t: node) output relation S(s: node, t: node) S(x, y) :- EMPTY(x, y). S(x, y) :- S(x,z), S(z,y). S(x, z) :- RA(i, x, a), S(a, b), RB(i, b, z).
Thanks. This one runs much faster!
Hi,
I have a quick question about the ddlog performance. Please see the following.
I have a .dl file that contains 4000+ lines. But the .dat file is fairly small. Please see the test files at https://github.com/helloqirun/ddlog_test
The timestamp is very small (0.21 seconds), as expected. However, the overall running time for dyck_cil is a bit long (~19 seconds).
Is the long-running time for dyck_cil an expected behavior? Thanks!