vmware / differential-datalog

DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.
MIT License
1.37k stars 118 forks source link

Trying to parse non-DNF disjunctions #244

Closed Argc0 closed 5 years ago

Argc0 commented 5 years ago

I was trying myself somes examples of disjunction of non-DNF form. An examples is:

.decl AMethod(?simplename:symbol, ?descriptor:symbol, ?type:symbol, ?method:symbol)
.decl BMethod(?class:symbol, ?superclass:symbol)
.decl CMethod(?ref:symbol, ?interface:symbol)

AMethod(?simplename, ?descriptor, ?type, ?method) :-
    AMethod(?simplename, ?descriptor, ?supertype, ?method),
    (BMethod(?type, ?supertype) ; CMethod(?type, ?supertype)).

So, I started from uncommenting the line from the grammar rule in souffle-grammar.pg:

Term: Literal
    | "!" Term
    | "(" Body ")" // TODO
    ;

When I was trying to simple run the souffle_python.py, to see if the example can pass the parglare an error appeared:

...
In state 74:FunctionCall and input symbol ')' can't decide which reduction to perform: '133: Literal = FunctionCall' or '107: Arg = FunctionCall'.
Traceback (most recent call last):
  File "souffle_converter.py", line 1400, in <module>
    main()
  File "souffle_converter.py", line 1396, in main
    convert(args.input, args.out, args.prefix, args.d)
  File "souffle_converter.py", line 1354, in convert
    parser = getParser(debug)
  File "souffle_converter.py", line 114, in getParser
    parglareParser = parglare.Parser(g, build_tree=True, debug=debugParser)
  File "/home/argc0/.local/lib/python2.7/site-packages/parglare/parser.py", line 88, in __init__
    self._check_parser()
  File "/home/argc0/.local/lib/python2.7/site-packages/parglare/parser.py", line 118, in _check_parser
    raise RRConflicts(unhandled_conflicts)
parglare.exceptions.RRConflicts: 
Reduce/Reduce conflicts in following states: set([74])

It seems that when I uncomment the line obove the grammar rule Literal and Arg are conflicted for the reduction of FunctionCall.

I do not understand why exactly this error appeared, but I solved it temporarily by changing the rule:

Literal : Arg Relop Arg
        | Atom
        | FunctionCall
        | TRUE
        | FALSE
        ;

to:

Literal : Arg Relop Arg
        | Atom
        | Arg
        | TRUE
        | FALSE
        ;

and it pass the parglare parser, but I am not sure how fair this change would be and I am trying to find a way so the grammar you wrote won't need a lot of changes, so at least can accept parentheses with or without disjunction.

ryzhyk commented 5 years ago

@mbudiu-vmw should be able to help with the parser question, but I am a bit confused as to what you are trying to accomplish. Even if the parser can recognize parentheses, disjunction, etc., DDlog does not support them at the moment (we are hoping to add support for arbitrary Boolean structure in the rules soon). It seems, the only situation where parsing parenthesis could be useful is in Boolean expressions like (x = 5; y > z), since DDlog does support those.

gfour commented 5 years ago

@ryzhyk Doop contains many rules like the following:

_MethodLookup_WithLen(?simplename, ?descriptor, ?type, ?method, n + 1) :-
    (DirectSuperclass(?type, ?supertype) ;
     DirectSuperinterface(?type, ?supertype)),
    _MethodLookup_WithLen(?simplename, ?descriptor, ?supertype, ?method, n),
    !MethodImplemented(?simplename, ?descriptor, ?type, _).

These are not in DNF form but can be converted by a preprocessor in the Souffle converter that feeds the converted DNF rules to the rest of the conversion process (@Argc0 is working on that). I think this preprocessor would also address #98.

ryzhyk commented 5 years ago

Understood. We considered doing the same, but held back, since we plan to support disjunctions in DDlog eventually; however if it is an important feature for you, I agree it makes sense to have at least a temporary implementation in the converter. Again, @mbudiu-vmw should be able to recommend the best way to do it.

Note, for performance, you do not want to compile:

X :- (A;B),C,D.

into

X :- A,C,D.
X :- B,C,D.

but rather

Y :- A.
Y :- B.
X :- Y,C,D.
mihaibudiu commented 5 years ago

Fixing the grammar is easy. Also, the souffle_converter could itself make the conversion. However, we think that in general converting to DNF could impact performance significantly, so the plan is instead to directly add support for "OR' in DDlog, which could natively be supported by the differential dataflow engine on top of which we run.

So whether we implement conversion to DNF or direct support of ORs depends mostly on how soon you need this feature.

mihaibudiu commented 5 years ago

BTW: our grammar is based on an older version of your souffle bison grammar, which has unfortunately been modified in the meantime. To parse disjunction without conflicts we would have to just convert the bison grammar to an equivalent form in Parglare.

gfour commented 5 years ago

@ryzhyk Native support for OR would be welcome of course! We can implement this preprocessor as a temporary workaround to experiment more with DDlog, until proper implementation comes.

I don't know much about the performance implications of duplicating rules in DDlog. It seems that in Doop, sometimes, instead of materializing intermediate results (that would need a relation and 2-3 very simple rules to fill that relation), the rules are written inline with an OR in the body of another rule, resulting in a non-DNF rule body. I see it mostly as a matter of conciseness/style, why is it bad for performance?

@mbudiu-vmw It seems that modifying the Parglare grammar to be in sync with latest Souffle parser.yy is needed in either case (either to feed non-DNF rules to our preprocessor workaround, or to feed them to a future translator targeting the proper implementation). So we continue experimenting with Parglare and will report back with more information.

mihaibudiu commented 5 years ago

For now I wasn't planning to refactor the grammar; that may be a lot of work.

If the grammar accepts roughly the same language we should be fine for a while. But I will make the grammar accept the Souffle programs including OR, but for now I will fail when converting programs that do not have ORs at the top-level.

mihaibudiu commented 5 years ago

The grammar will hopefully be fixed today. No ETA for the disjunction support in ddlog, though.

ryzhyk commented 5 years ago

I don't know much about the performance implications of duplicating rules in DDlog. It seems that in Doop, sometimes, instead of materializing intermediate results (that would need a relation and 2-3 very simple rules to fill that relation), the rules are written inline with an OR in the body of another rule, resulting in a non-DNF rule body. I see it mostly as a matter of conciseness/style, why is it bad for performance?

@gfour, I did not mean to say that this is bad for performance. I was just suggesting that there are different ways to compile such rules and the more efficient one is probably going to be the one that introduces an intermediate relation for the disjunction (see my comment above). Hope this makes sense :)

mihaibudiu commented 5 years ago

https://github.com/ryzhyk/differential-datalog/pull/247

ryzhyk commented 5 years ago

@Argc0, @gfour, @mbudiu-vmw, ok to close this issue?

gfour commented 5 years ago

Yes, we can close this issue. Related PR: #255 (a converter for non-DNF patterns found in Doop).