Boolean variables when using exact and abstract

marko-vasic commented 4 years ago

Hi,

I have encountered following surprising case in Alloy (see code bellow): When using the exact scope size number of boolean variables is more than when using up to, while one would expect the opposite. This behavior goes away when removing abstract keyword (see example bellow). Does anyone have idea why is this happening?

=============== abstract sig S {}

sig S1 extends S {}

pred P() { some S }

exact: run P for exactly 2 S upto: run P for 2 S

pkriens commented 4 years ago

What do you mean with boolean variables?

Kind regards,

Peter Kriens

On 13 May 2020, at 17:13, marko-vasic notifications@github.com wrote:

boolean variables

marko-vasic commented 4 years ago

Thank you for the prompt response. After executing the previous commands in Alloy, this is the output: Executing "Run exact for exactly 2 S" Solver=minisat(jni) Bitwidth=0 MaxSeq=0 Symmetry=20 47 vars. 6 primary vars. 64 clauses. 4ms. . found. Predicate is consistent. 17ms.

Executing "Run upto for 2 S" Solver=minisat(jni) Bitwidth=0 MaxSeq=0 Symmetry=20 6 vars. 2 primary vars. 4 clauses. 2ms. . found. Predicate is consistent. 3ms.

We can see that there is more primary vars when executing with exactly, than upto bounds, which is opposite than what is expected. The situation changes when removing abstract keyword.

Hope this clarrifies.

pkriens commented 4 years ago

I am stepping outside my area of expertise here ... so don't take me too serious.

I am assuming that the 'exactly' needs more constraints to enforce the exactly. Look at he log:

For exactly:

Executing "Run exact for exactly 2 S" Sig this/S scope <= 2 Sig this/S1 scope <= 2 Sig this/S in [[S$0], [S$1]] with size==2 Sig this/S1 in [[S$0], [S$1]]

Executing "Run upto for 2 S" Sig this/S scope <= 2 Sig this/S1 scope <= 2 Sig this/S in [[S$0], [S$1]] Sig this/S1 in [[S$0], [S$1]]

The only difference is the exact case has the constraint 'size==2'. Seems logical it requires more clauses in the cnf?

I agree, the difference seems a bit excessive. However, it might be caused by the fact that it is an integer constraint. Integers tend to use a lot of bits in.

You could single step through the kodkod code of course ...

That said, I am out of my league here ...

On 13 May 2020, at 21:24, marko-vasic notifications@github.com wrote:

Thank you for the prompt response. After executing the previous commands in Alloy, this is the output: Executing "Run exact for exactly 2 S" Solver=minisat(jni) Bitwidth=0 MaxSeq=0 Symmetry=20 47 vars. 6 primary vars. 64 clauses. 4ms. . found. Predicate is consistent. 17ms.

Executing "Run upto for 2 S" Solver=minisat(jni) Bitwidth=0 MaxSeq=0 Symmetry=20 6 vars. 2 primary vars. 4 clauses. 2ms. . found. Predicate is consistent. 3ms.

We can see that there is more primary vars when executing with exactly, than upto bounds, which is opposite than what is expected. The situation changes when removing abstract keyword.

Hope this clarrifies.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/AlloyTools/org.alloytools.alloy/issues/118#issuecomment-628196319, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABQ6LSBQVJ3OFTV7HVWIILRRLXVVANCNFSM4M725LZQ.

aleksandarmilicevic commented 4 years ago

Marko, you are right that using "exact" instead "up to" scopes generally leads to fewer primary variables. That is because "exact" scopes, in most cases, can be enforced by tightening the bounds (and not by adding integer constraints). Consider this model

sig X { x: one X }

exact: run {} for exactly 2 X
upto: run {} for 2 X

Both exact and upto commands are translated to exactly the same Kodkod formula. The bound for X, however, is different:

for exact: this/X: (2, 2) :: [[[X$0], [X$1]]]
for upto: this/X: (0, 2) :: [[], [[X$0], [X$1]]]

Finally, encoding the fact that X must be exactly a set of 2 elements can be done with fewer variables that encoding that X can be any set "between" an empty set and a set of 2 elements.

Now, things become more complicated in presence of subtypes and abstract sigs. In your example, exactly 2 S cannot be enforced by settings an exact bound for S because no Kodkod relation is created for the abstract sig S! Instead, exactly 2 S is enforced with an additional constraint:

(some v3: this/S1, v2: this/S1 | 
  (v3 + v2) = this/S1 && 
  no (v3 & v2) && 
  true) &&

You could say that when there is a single abstract sig and a single non-abstract subtype of that sig, setting an exact scope for the abstract sig is equivalent to setting that same exact scope for that sub-sig; while that is true, the Alloy Analyzer does not attempt to make that inference for this corner case and instead it implements a generic mechanism that works for cases with multiple non-abstract sub-sigs.

To read more about Alloy subtypes, see this paper:

Jonathan Edwards, Daniel Jackson, Emina Torlak, and Vincent Yeung. Faster constraint solving with subtypes. International Symposium on Software Testing and Analysis (ISSTA), 2004.

marko-vasic commented 4 years ago

Thank you for your prompt responses Aleksandar and Peter!

The bigger picture behind my post is that I have a larger model in which when using "exact" bounds (say exactly 5) time to enumerate all solutions is much larger than when using "up to" bounds (which include all sizes between 0 and 5, including 5). This looked quite surprising and a bit illogical to me. From the model I extracted the minimal code that reflects such behavior which is what I posted.

As I do not deeply understand Kodkod and Alloy internals, I will understand your response as that there are some optimizations that Alloy occasionally performs, because of which we can encounter surprising situations in which code that we thought will run faster does not.

In your example, exactly 2 S cannot be enforced by settings an exact bound for S because no Kodkod relation is created for the abstract sig S!

I am not sure if this is relevant to what you are saying, but including a relation in the model I sent (s: one S) would again produce more primary variables with "exact" bounds.

You could say that when there is a single abstract sig and a single non-abstract subtype ... the Alloy Analyzer does not attempt to make that inference for this corner case and instead it implements a generic mechanism that works for cases with multiple non-abstract sub-sigs.

Again, I am not sure if this is relevant, but even with multiple sub-sigs, "exact" bounds will have more primary variables.

aleksandarmilicevic commented 4 years ago

including a relation in the model I sent (s: one S) would again produce more primary variables with "exact" bounds.

Irrespective of the rest of the model, no relation is created for abstract sigs (and hence the exact scope on such an abstract sig cannot be translated to a bound, which is why the final formula ends up having more variables).

even with multiple sub-sigs, "exact" bounds will have more primary variables.

That's exactly what I meant to say: because of the case with multiple sub-sigs of an abstract sig, an exact scope for the abstract sig cannot be encoded as a bound but instead it must be encoded as a constraint (and therefore there will be more primary variables). All I said was that in the special case when there is only one sub-sig of an abstract sig, Alloy could be smarter and figure out how to translate that exact scope into a tight bound on the sub-sig, but that's not what the Alloy Analyzer currently does.

there are some optimizations that Alloy occasionally performs, because of which we can encounter surprising situations in which code that we thought will run faster does not.

I think that's a fair statement in general. In this particular case, I tried to explain why exactly specifying an exact scope for an abstract sig actually makes the problem more difficult.

marko-vasic commented 4 years ago

I think I understand it now, thanks!

AlloyTools / org.alloytools.alloy

Boolean variables when using exact and abstract #118