magic7s opened this issue 3 years ago
Thanks for reporting, challenging issue. I'll have a look
This is an excellent example to serve as a basis for performance improvement. The problem is indeed simple to understand. There are many ways to reduce the computation time (changing the way constraints are expressed in the code, logic/tactics, multiprocessing, etc.). The main problem, however, is that I have the feeling the algorithmic complexity of the problem is exponential (or quadratic?); I have to confirm that. It might come from the problem itself or from the implementation I chose. I also noticed a bottleneck at problem creation, before the solver is even defined. I will refer to this ticket each time I push a commit that may improve performance and needs to be tested.
Note that recent commit https://github.com/tpaviot/ProcessScheduler/commit/5544e977ba5ad30c3726f365ad44a9e6b85aa982 should reduce computation time (and memory footprint, to be checked)
@magic7s Added the first draft of a benchmark to evaluate further fixes/refactoring/optimizations impacts.
"QF_LIA" logics seems to bring improvements
I ran my test script through Datadog. One thing that is odd: timeit told me that adding a resource to task 100 took about 60s, but DD shows only about 15s. I don't see where the other 45s come from.
Commit 9fffa1c2882ed56dc1c447fd1d537bebbd482234 brings significant improvements
@tpaviot that made a huge difference.
The previous commit was reverted; it did not actually improve performance.
There have been recent performance improvements. @dreinon, could you please pull the latest master branch and test your use case?
Sure thing
@magic7s @dreinon If you want the best performance, use the "QF_IDL" logic:
solver = ps.SchedulingSolver(problem, logics="QF_IDL")
What does this do?
Each SMT logic uses specific techniques to solve SMT problems, see https://smtlib.cs.uiowa.edu/logics.shtml
Interesting
So far, I have not found a way to guess the best logic for a given problem (sometimes QF_IDL doesn't work because some constraints cannot be expressed as integer differences).
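A pragmatic workaround until then is to try the most restrictive logic first and fall back on failure. A minimal sketch, assuming that solving raises Z3Exception when a constraint doesn't fit the logic (as reported later in this thread) and that logics=None means the solver default:

from z3 import Z3Exception
import processscheduler as ps

def solve_with_fallback(problem, candidates=("QF_IDL", "QF_UFIDL", None)):
    for logics in candidates:
        try:
            solver = ps.SchedulingSolver(problem, logics=logics)
            return solver.solve()
        except Z3Exception:
            continue  # constraints cannot be expressed in this logic
    return None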
Adding digital_transformation.add_objective_priorities() and digital_transformation.add_objective_makespan() to the benchmark makes it basically unusable. Is it something I am doing wrong?
python benchmark/benchmark_dev_team_with_priority_makespan.py --nmax 30
#### Benchmark information header ####
Date: 2021-06-01 14:29:36.223333
Id: 4b575954
Software:
Python version: 3.9.5
ProcessScheduler version: 0.7.0dev
z3 version: Z3 4.8.10.0
ProcessScheduler commit number: 32cfdbf
OS:
OS: Darwin
OS Release: 20.5.0
OS Version: Darwin Kernel Version 20.5.0: Sat May 8 05:10:33 PDT 2021; root:xnu-7195.121.3~9/RELEASE_X86_64
Hardware:
Machine: x86_64
Physical cores: 2
Total cores: 4
Max Frequency: 2300.00Mhz
Min Frequency: 2300.00Mhz
Total memory: 8.00GB
-> Num dev teams: 10
Solver type:
===========
-> Solver with optimization enabled
Objectives:
======
Indicator_PriorityTotal
horizon
Total computation time:
=====================
DigitalTransformation satisfiability checked in 2726.02s
Optimization results:
=====================
->Objective priority specification: lex
->Objective values:
->Indicator_PriorityTotal(min objective): -1*oo
->horizon(min objective): -1*oo
Solver statistics:
conflicts: 2699946
decisions: 3264460
propagations: 21806508
binary propagations: 10075237
restarts: 1712
final checks: 87
added eqs: 1354036
mk clause: 2881689
del clause: 2881002
minimized lits: 1286773
num checks: 63
mk bool var: 6206
pb resolves: 78
pb conflicts: 78
pb propagations: 142825
pb predicates: 60
arith eq adapter: 4049
arith-lower: 13178760
arith-upper: 13810886
arith-propagations: 549061
arith-conflicts: 2623631
arith-bound-propagations-lp: 1181047
arith-bound-propagations-cheap: 549061
arith-diseq: 1165069
arith-gomory-cuts: 2
arith-branch: 3
arith-make-feasible: 6854150
arith-max-columns: 550
arith-max-rows: 454
arith-gcd-calls: 8
arith-cube-calls: 2
arith-cube-success: 2
arith-patches: 8
arith-patches-success: 1
arith-cheap-eqs: 12425744
eliminated vars: 21
num allocs: 117202394
rlimit count: 1398064850
max memory: 121.52
memory: 120.89
@magic7s When asking for optimization of more than one objective, the regular z3 optimization solver is used instead of the incremental solver. In my experience, regular z3 optimization has been pretty slow and hasn't scaled well, so I decided to avoid it and use a single objective. However, I can imagine lots of multi-objective problems where all objectives have to be taken into account.
Have you tried the different priority options for the solver (pareto, box, lex)? These are described in the docs if you'd like to check them out.
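For reference, these priority modes come from z3's Optimize engine; with raw z3 (outside ProcessScheduler) they look roughly like this minimal illustration:

from z3 import Optimize, Int, sat

opt = Optimize()
x, y = Int("x"), Int("y")
opt.add(x >= 0, y >= 0, x + y <= 10)
opt.set(priority="pareto")  # alternatives: "lex" (the default) and "box"
opt.maximize(x)
opt.maximize(y)

# in pareto mode, each check() yields one Pareto-optimal point until unsat
while opt.check() == sat:
    print(opt.model())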
@magic7s Multi-objective optimization is widely untested so far. IMO it deserves another benchmark. Adding a makespan objective to this benchmark does not really make sense, since the schedule is already the best one (horizon=num_dev_team), all priorities are the same, etc.
I agree with @dreinon that the regular z3 optimization solver is very slow. That's why I had to create what I called an "incremental solver". I guess I will have to adopt a similar solution for multi-objective optimization.
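Roughly, the principle looks like this (a simplified sketch of the idea, not the actual ProcessScheduler code):

from z3 import Solver, Int, sat

def incremental_minimize(constraints, objective):
    # minimize an integer objective with a plain Solver, without Optimize
    s = Solver()
    s.add(constraints)
    best = None
    while s.check() == sat:
        best = s.model().eval(objective, model_completion=True).as_long()
        s.add(objective < best)  # require a strictly better value next time
    return best  # the last sat value is the minimum (None if unsat)

Because the solver keeps its learned clauses between successive checks, each iteration is usually cheap compared to restarting the regular Optimize engine.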
@magic7s I opened a dedicated issue for multi-objective optimization see #57
Thank you all for the work on this. I'm very excited to see this succeed. Of course I would prefer to use all the amazing optimizations (e.g. priority, span, single resource flow time), but I will try to use just one at a time for now.
For context, the scenario I'm trying to solve involves 150 dev teams that need to complete a variety of tasks leveraging different pools of resources. Let's say each dev team needs to complete 5 tasks, each task has different combinations of resources, and sometimes the dev team itself is a required resource, sometimes not. At the end I would like to produce a schedule for the resources (and dev teams) that covers the shortest time span.
@magic7s I noticed all your tasks have a duration of 1. Is it the case in your real-life problem?
@tpaviot Not always. I'm using a duration delta of one week and assigning some tasks a duration of 1, some 2, some 3.
Another thing I found is that with optimization enabled, the timeout is the minimum time the job will run, since the solver keeps trying to find a better solution for that length of time. So if I set the timeout to 600s to help ensure I find at least one solution, every job will run for 10 minutes even if the best solution was found in 10s.
Additionally, when multi-objective optimization is used, the timeout is not respected: it will run until a solution is found or the process is killed.
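For reference, the timeout I mention is the one set at solver creation, which, if I'm reading the API right, is the max_time keyword in seconds:

solver = ps.SchedulingSolver(problem, max_time=600)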
If you are interested I pulled out some of the identifying information and opened up the project at https://gitlab.com/bdowney/digital-transformation-schedule
logics="QF_IDL"
z3.z3types.Z3Exception: b'Benchmark is not in QF_IDL (integer difference logic).'
I'm getting this error, where I didn't previously. How can I debug this?
Try QF_UFIDL
I am struggling to scale this up to what I think is a fairly low number. Maybe it's my code; hoping you have some insight.
I have a simplified script that creates n tasks and assigns each one an A resource and a B resource from a pool of As and a pool of Bs. In my tests, each pool is only 2 deep, so two As and two Bs. Doing the math in my head, if I had 50 tasks then the schedule should be 25 time intervals. Fairly simple. The setup is roughly the sketch below.

My issue is the processing time. For 50 tasks it took 8m17.583s to get to an answer of 40 intervals (with a 30s timeout). For 100 tasks it took 103m58.873s and never even found a single solution (presumably because of the 30s timeout).

I need to be able to run this with up to 5-10 types of tasks and 150 of each type, with some small constraints (e.g. don't overlap, this before that, etc.).
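Simplified sketch (hypothetical names, and I'm assuming the SelectWorkers and max_time usage is right; the real script and timings are in the gist linked below):

import processscheduler as ps

n = 50  # number of tasks
pb = ps.SchedulingProblem("PoolScaling")

# two workers of each kind, i.e. pools that are 2 deep
pool_a = [ps.Worker("A_%i" % i) for i in range(2)]
pool_b = [ps.Worker("B_%i" % i) for i in range(2)]

for i in range(n):
    task = ps.FixedDurationTask("T_%i" % i, duration=1)
    # each task takes exactly one A and one B from the pools
    task.add_required_resource(ps.SelectWorkers(pool_a, 1))
    task.add_required_resource(ps.SelectWorkers(pool_b, 1))

pb.add_objective_makespan()  # ask for the shortest schedule
solver = ps.SchedulingSolver(pb, max_time=30)
solution = solver.solve()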
Any thoughts?
I timed each run for 10 - 100 tasks in increments of 10. Script and output here: https://gist.github.com/magic7s/35cdce42ce58f1898b9e57d80e67f28a