Closed madmike200590 closed 1 year ago
Current status: Programs can be reified and written to stdout using CLI flag `-reify`, next steps:
Base: 70.58% // Head: 66.83% // Decreases project coverage by -3.74% :warning:
Coverage data is based on head (28ab2bb) compared to base (b8af1d7). Patch coverage: 13.93% of modified lines in pull request are covered.
TODO:
We should map constant term types to the "ASP types" integer, string, symbol, and object rather than just writing Java types (see instances of `constantTerm_class/2` in the reified code).
@madmike200590 Working with the reification a bit, I noticed that some additional information on terms (and probably atoms) is always needed. Deriving this on the reified facts is possible, but I think it needlessly slows down any evaluation as the information is already stored in the internal representation and computing it based on the reified facts involves a lot of join-evaluations.
What I think should be part of every reification:
- `term_is_ground(T)`
- `functionTerm_numArguments(FT,Num)`

The following may also be useful:
- `atom_is_ground(A)`
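To make the cost of deriving groundness from the reified facts more concrete, here is a minimal Python sketch of that derivation. It relies only on `term_type/2` and `functionTerm_argumentTerm/3` as described in this PR. Modelling facts as tuples, the function name `ground_term_ids`, the ids in the sample data, and leaving out arithmetic terms are all assumptions made for illustration; this is not how Alpha computes or stores the information.

```python
from collections import defaultdict


def ground_term_ids(facts):
    """Return the ids of all ground terms found in a list of reified facts.

    Facts are modelled as tuples (predicate_name, arg1, ...); this
    representation is an assumption made for the sketch.
    """
    term_type = {}
    arguments = defaultdict(list)
    for fact in facts:
        if fact[0] == "term_type":
            _, t_id, kind = fact
            term_type[t_id] = kind
        elif fact[0] == "functionTerm_argumentTerm":
            _, t_id, _idx, arg_id = fact
            arguments[t_id].append(arg_id)

    def is_ground(t_id):
        kind = term_type[t_id]
        if kind == "constant":
            return True
        if kind == "variable":
            return False
        if kind == "function":
            # One lookup per argument, repeated at every nesting level.
            return all(is_ground(arg) for arg in arguments[t_id])
        raise ValueError(f"term type not covered by this sketch: {kind}")

    return {t_id for t_id in term_type if is_ground(t_id)}


# Hypothetical ids for the terms f(a, X) and g(b).
facts = [
    ("term_type", 1, "function"),
    ("functionTerm_argumentTerm", 1, 0, 2),
    ("functionTerm_argumentTerm", 1, 1, 3),
    ("term_type", 2, "constant"),
    ("term_type", 3, "variable"),
    ("term_type", 4, "function"),
    ("functionTerm_argumentTerm", 4, 0, 5),
    ("term_type", 5, "constant"),
]
ground = ground_term_ids(facts)
```

If the reification emitted `term_is_ground(T)` directly, a consumer could read the answer off a single fact instead of recursing through the argument facts.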
@AntoniusW is there anything you think needs improving and/or changing on this? Otherwise I'd like to go ahead and merge.
I haven't had the time to review the code. Will do so in the next days and I assume that most of it is fine already.
Since the PR description is already very good documentation of the reification itself, I suggest adding a clear explanation of what happens with facts in the input program. The example covers the case of a rule, and from the description it is not clear whether facts are represented just like rules (i.e., with a `rule_head` and a `normalHead_atom` fact), or by some shorthand, which seems to be the case (i.e., for a fact we get `fact(ID)` and then `atom_type(ID,basic)` and then `basicAtom_...` but no `rule_head` etc.).
Working a bit more with the reification I noticed that every occurrence of each and every term gets its own id. This makes writing rules extremely cumbersome as, for example, all occurrences of the same variable `X` in a rule receive different term ids. The behavior I expected is already implemented for predicates (i.e., a lookup-table of ids for known predicates). Is there a particular reason why this approach was not chosen for terms?
As per our discussion on that topic, this leads to a not-so-straightforward question of whether a term id refers to a term symbol or the abstract object the term stands for. I went with the former because it seemed more intuitive to me. I'd like to keep it as-is in order to finally get this feature finished.
@AntoniusW I've tried to add some more test cases for at least most basic program constructs. These are far from exhaustive, but still took a long time to write (since programmatically verifying reification results is rather involved). In order to move forward with this feature, I'd like to get it merged as it is now. Even if there should still be some bugs lurking, we won't be breaking anything, given that reification is for now an experimental feature and nothing depends on it.
The intention of this PR is to provide functionality to reify (i.e. encode as ASP facts) input programs. These reified programs can be written to standard out for further use, or used by Alpha itself for static analysis of input programs.
Program reification
Reified programs are written to stdout using CLI option `-reify`.

Reification of ASP language constructs

Programs
A program consisting of facts `F_1, ..., F_n` and rules `R_1, ..., R_k` is represented using atoms of the form `fact(ID)` and `rule(ID)`, respectively. `ID` is a unique identifier for the (fact-)atom or rule in question.

Rule Heads
Rule heads are represented using the predicate `rule_head/2`, i.e. for each rule head, an atom of the form `rule_head(R_ID, H_ID)` exists in the reified program, where `R_ID` is a unique id for the rule and `H_ID` a unique id for the head. An additional atom representing the type of head (normal or choice) is of the form `head_type(H_ID, TYPE)`, where `TYPE` is either `normal` or `choice`.

Normal Heads wrap an atom, which is referenced using a fact of the form `normalHead_atom(H_ID, A_ID)`, with `A_ID` being a unique identifier for the head atom.

Choice Heads consist of lower and upper bound terms represented as `choiceHead_lowerBound(H_ID, LBT)` and `choiceHead_upperBound(H_ID, UBT)`, respectively. A choice head consists of one or more choice elements which are referenced using `choiceHead_element(H_ID, E_ID)` and `choiceHead_numElements(H_ID, NUM)`.

Each choice element consists of one atom and an arbitrary number of condition literals and is represented using `choiceElement_atom(E_ID, A_ID)`, `choiceElement_conditionLiteral(E_ID, L_ID)` and `choiceElement_numConditionLiterals(E_ID, NUM)`.
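The head encoding can be sketched in a few lines of Python. The predicate names are taken from the description of rule heads; everything else (facts as tuples, the function names, the string ids, and the assumption that both bounds are always present as term ids) is hypothetical and only meant to show which facts belong together, not how Alpha generates them.

```python
def reify_normal_head(rule_id, head_id, atom_id):
    """Facts for a normal head wrapping a single atom."""
    return [
        ("rule_head", rule_id, head_id),
        ("head_type", head_id, "normal"),
        ("normalHead_atom", head_id, atom_id),
    ]


def reify_choice_head(rule_id, head_id, lower_id, upper_id, elements):
    """Facts for a choice head.

    elements is a list of (element_id, atom_id, condition_literal_ids).
    """
    facts = [
        ("rule_head", rule_id, head_id),
        ("head_type", head_id, "choice"),
        ("choiceHead_lowerBound", head_id, lower_id),
        ("choiceHead_upperBound", head_id, upper_id),
        ("choiceHead_numElements", head_id, len(elements)),
    ]
    for element_id, atom_id, literal_ids in elements:
        facts.append(("choiceHead_element", head_id, element_id))
        facts.append(("choiceElement_atom", element_id, atom_id))
        facts.append(
            ("choiceElement_numConditionLiterals", element_id, len(literal_ids))
        )
        for literal_id in literal_ids:
            facts.append(("choiceElement_conditionLiteral", element_id, literal_id))
    return facts


# Made-up ids: one choice element with two condition literals.
normal = reify_normal_head("r1", "h1", "a1")
choice = reify_choice_head("r2", "h2", "t1", "t2", [("e1", "a2", ["l1", "l2"])])
```

The explicit `..._num...` facts let a consumer check that it has seen every element or condition literal without counting them itself.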
Constraints
For constraints, instead of a `rule/1` instance, an atom of the form `constraint(ID)` is generated. In every other respect, a constraint is just a rule without a head, i.e. there is no difference in reification of body literals.

Rule Bodies
Every body literal of a rule referenced by id `R_ID` is represented as `rule_bodyLiteral(R_ID, L_ID)`. The number of body literals is encoded using `rule_numBodyLiterals(R_ID, NUM)`.
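As a rough illustration of how rules and constraints differ only in their top-level marker, the following Python sketch builds the body-related facts for both. The predicate names come from this description; the tuple representation, the function name `reify_rule_shell`, the `has_head` parameter, and the ids are assumptions made for the example.

```python
def reify_rule_shell(rule_id, body_literal_ids, has_head=True):
    """Top-level marker plus body facts for a rule or a constraint."""
    # A constraint gets constraint(ID) instead of rule(ID); the body facts
    # are the same in both cases.
    marker = "rule" if has_head else "constraint"
    facts = [(marker, rule_id)]
    for literal_id in body_literal_ids:
        facts.append(("rule_bodyLiteral", rule_id, literal_id))
    facts.append(("rule_numBodyLiterals", rule_id, len(body_literal_ids)))
    return facts


# Made-up ids for a rule with two body literals and a constraint with one.
rule_facts = reify_rule_shell("r1", ["l1", "l2"])
constraint_facts = reify_rule_shell("c1", ["l3"], has_head=False)
```

Head facts would be added separately for the rule; the constraint has none.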
Literals
Every literal is uniquely identified using an identifier `L_ID` and encoded using facts `literal_polarity(L_ID, P)`, where `P` is either `pos` or `neg`, and `literal_atom(L_ID, A_ID)`, where `A_ID` references the atom wrapped by the literal.

Atoms
For every atom, there is a fact `atom_type(A_ID, TYPE)`, where `TYPE` is the type of the atom in question: `basic`, `comparison`, `external` or `aggregate`. Further facts depend on the type of atom that is reified.

Basic Atoms are encoded using `basicAtom_predicate(A_ID, P_ID)` to represent the predicate of the atom and `basicAtom_numTerms(A_ID, NUM)` as well as `basicAtom_term(A_ID, IDX, T_ID)` for the terms of the atom. `IDX` refers to the position in the argument list of the term uniquely identified by `T_ID`.

Comparison Atoms are reified to
where a comparison operator `OP` is one of `eq`, `ne`, `le`, `lt`, `ge`, `gt`.

The reified representation of an external atom is
Aggregate Atoms are represented using facts `aggregateAtom_leftHandTerm(A_ID, T_ID)`, `aggregateAtom_rightHandTerm(A_ID, T_ID)`, `aggregateAtom_leftHandOperator(A_ID, OP)` and `aggregateAtom_rightHandOperator(A_ID, OP)` for left- and right-hand terms and operators. The aggregate function for a specific literal is given by `aggregateAtom(A_ID, FUNC)`, with `FUNC` being `count`, `sum`, `min` or `max`. Each aggregate element is encoded as `aggregateAtom_aggregateElement(A_ID, E_ID)` and has a unique identifier `E_ID`. For each aggregate element, the numbers of terms and literals are given as `aggregateElement_numTerms(E_ID, NUM)` and `aggregateElement_numLiterals(E_ID, NUM)`. Each element term and its position are referenced as `aggregateElement_term(E_ID, IDX, T_ID)`, while literals are encoded as `aggregateElement_literal(E_ID, L_ID)`.
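The positional facts for atoms can be consumed along the following lines. This Python sketch recovers the argument list of a basic atom from `basicAtom_term/3` and cross-checks it against `basicAtom_numTerms/2`. The tuple representation, the function name `basic_atom_terms`, and the ids are made up; whether `IDX` starts at 0 or 1 is not stated in this description, so the sketch relies only on the ordering of the indices.

```python
def basic_atom_terms(facts, atom_id):
    """Return the term ids of a basic atom, ordered by argument position."""
    indexed = []
    expected = None
    for fact in facts:
        if fact[0] == "basicAtom_term" and fact[1] == atom_id:
            _, _a_id, idx, t_id = fact
            indexed.append((idx, t_id))
        elif fact[0] == "basicAtom_numTerms" and fact[1] == atom_id:
            expected = fact[2]
    # The count fact makes it possible to detect missing term facts.
    if expected is not None and expected != len(indexed):
        raise ValueError("basicAtom_numTerms does not match the term facts")
    return [t_id for _idx, t_id in sorted(indexed)]


# Made-up ids; the term facts are deliberately listed out of order.
facts = [
    ("atom_type", "a1", "basic"),
    ("basicAtom_predicate", "a1", "p1"),
    ("basicAtom_numTerms", "a1", 2),
    ("basicAtom_term", "a1", 1, "t7"),
    ("basicAtom_term", "a1", 0, "t4"),
]
terms = basic_atom_terms(facts, "a1")
```

The same pattern would apply to `aggregateElement_term/3` together with `aggregateElement_numTerms/2`.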
Terms
Every reified term has a fact `term_type(T_ID, TYPE)` specifying the term type (`constant`, `variable`, `arithmetic`, `function`).

Constants have a type and value: `constantTerm_type(T_ID, C_TYPE)` and `constantTerm_value(T_ID, VAL)`, where `C_TYPE` is either `symbol` for a string representing a symbolic constant, `integer` for an integer constant, `string` for a string constant, or `object(JAVA_TYPE)` for a Java object, with `JAVA_TYPE` representing the value of `term.getObject().getClass().getName()`. The value `VAL` of a term is always the string representation of the respective constant.

Variable Terms are encoded using their variable symbol as a string: `variableTerm_symbol(T_ID, SYM)`.

Arithmetic Terms are reified as:
Function Terms are encoded using atoms `functionTerm_symbol(T_ID, SYM)`, `functionTerm_numArguments(T_ID, NUM)` and one atom `functionTerm_argumentTerm(T_ID, IDX, ARG_ID)` per argument term.

Facts
Facts are encoded as regular atoms (i.e. using `atom_type` etc. predicates), but in addition result in an atom `fact(<atomId>)`, denoting that the atom with id `<atomId>` is a fact.

Example
Consider the following program
The reified version of this program is: (comments inserted manually, not generated by Alpha)