Endless ILP loop on Susan/tetris

mr-ma commented 5 years ago

Right after the solution is found (which takes more than an hour), another calculation starts. I'm not sure if there is a problem in the generator script or ILP solver.cpp Any thoughts @dennisfischer ?

+ 49648: mip =     not found yet <=   3.456000000e+03        (2; 0)
Time used: 3830.7 secs.  Memory used: 59.7 Mb.
# 57529: obj =   3.456000003e+03 inf =   0.000e+00 (20764) 0%
# 61560: obj =   3.456000003e+03 inf =   0.000e+00 (16733) 0%
# 65655: obj =   3.456000003e+03 inf =   0.000e+00 (12638) 0%
# 69882: obj =   3.456000003e+03 inf =   0.000e+00 (8315) 0%
# 74066: obj =   3.456000001e+03 inf =   0.000e+00 (1681) 0%
# 74941: obj =   3.456000000e+03 inf =   0.000e+00 (0) 0%
Solution found by heuristic: 3456
+ 74941: mip =   3.456000000e+03 <=     tree is empty   0.0% (0; 3)
INTEGER OPTIMAL SOLUTION FOUND
Writing basic solution to '/home/sip/eval/binaries/susan.bc/100/0/1/solution.txt'...
90040 lines were written
Writing MIP solution to '/home/sip/eval/binaries/susan.bc/100/0/1/solution_readable.txt'...
ILP resuls. max manifest.  overhead: 0.000000e+00 explicit instruction coverage: 0 implicit instruction coverage: 0
Adding 3456 manifests to protection graph
#3456/3456
Size: 26454
ILP sanity rows:312511 columns:312511 coefs:312511 ia3:4 3 2 
Writing problem to/home/sip/eval/binaries/susan.bc/100/0/1/problem.txt
Writing problem data to '/home/sip/eval/binaries/susan.bc/100/0/1/problem.txt'...
169866 lines were written
GLPK Simplex Optimizer, v4.65
51202 rows, 38830 columns, 152093 non-zeros

dennisfischer commented 5 years ago

We can see this happening for other programs such as tetris too. As mentioned in #24 this happens if there are cycles after resolving the ILP.

Example tetris: ILP started with ~11000 rows ILP is still not finished after running 400 times with now 11400 rows. Each iteration adds a found cycle as a constraint.

Ideas to speed up the solution:

~Adding all cycles was not feasible.~
Add N cycles per iteration, e.g. 100/1000

dennisfischer commented 5 years ago

Tried adding N cycles. Tetris finishes with N=1 immediately, likely indicating a bug in all_cycles.hpp cycle detection. Manifest numbers are too low.

Added output to show the number of elements in a cycle using the current approach. Expected to see decreasing number of elements. However, number oscillates without decreasing. Will give the program a bit more time to see if it eventually finishes.

dennisfischer commented 5 years ago

There is some progress, i.e., the number of elements in a detected cycle is decreasing. However, the decrease is very slow. 200 iterations (1110 elements), 300 iterations (1104 elements), 400 iterations (1102 elements).

Tetris:

cycles have a length of ~200
each iteration adds a new unique cycle
Example:
- cycle of length 10
- m1 + ... + m10 <= 9
- m1 + ... + m9 <= 8
- m1 + ... + m8 + m10 <= 8
- m1 + ... + m7 + m9 + m10 <= 8

Because in most cases only a single manifest is different we make a very small amount of progress. Enumerating all cycles is pointless (too many). Enumerating N cycles with all_cycles.hpp approach did not work and resulted in almost 0 coverage.

Ideas:

If we have two or more cycles, then create the intersection of both sets. Use the result to rebuild the manifest graph and add a cycle of smaller length.
Define the underlying cycles implicitly with constraints, i.e., m1 <= m2 <= m3 <= m1, instead of m1 + m2 + m3 <= 2

@mr-ma Thoughts?

dennisfischer commented 5 years ago

I was able to speed up the result with idea 1. In fact, all_cycles.hpp probably works as expected.

The cycle does not work well with undo constraints. Of >400 manifests it selects a lot less after pruning the problem.txt file.

problem.txt sol.txt

dennisfischer commented 5 years ago

Same behavior is observable for susan, say, toast (all of which have cycles).

mr-ma commented 5 years ago

@dennisfischer how did you handle cycles in your thesis evaluations? If I remember correctly you used a conservative approach in which all cycles are recursively detected and subsequently eliminated. Can't we do the something similar here? I.e. recursively detecting all cycles before triggering the ILP optimization. Then, we feed all the detected cycles as constraints. Does that make sense?

dennisfischer commented 5 years ago

@mr-ma I handled them exactly the same. In fact, have a look at the attached problem.txt. I've manually removed a lot of constraints, leaving us with cycles, dependency, nOf. Removing nOf produces expected results of selecting the maximum number of manifests at the cost of having hashes without asserts or asserts without hashes.

Problems started occurring after introducing the "n of m asserts" constraint (see #29 ) I see that this constraint is likely too strict. #29 proposes a solution

mr-ma / composition-framework

Endless ILP loop on Susan/tetris #23