tpaviot / ProcessScheduler

A Python package for automatic and optimized resource scheduling
https://processscheduler.github.io/
GNU General Public License v3.0

Way to maximize resource utilization? #8

Closed dreinon closed 3 years ago

dreinon commented 3 years ago

Hey! I would like to know if there is a way to implement an objective that maximizes resource utilization, since using add_indicator_resource_cost(list_of_resources) as an indicator to maximize doesn't seem to work. My aim is to schedule as many tasks as possible, since all of them are optional. They are all optional because my problem is expected to receive more tasks as input than can actually be done and to find the best solution (or set of solutions); if I don't mark them as optional, the output is simply that the problem has no solution.
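For context, a minimal sketch of the kind of setup I mean, using the resource utilization indicator discussed below (task and resource names are purely illustrative, and I'm assuming optional tasks are declared with optional=True):

import processscheduler as ps

# more optional tasks than can possibly fit in the horizon
problem = ps.SchedulingProblem('Example', horizon=20)
car = ps.Worker('Car')

tasks = [ps.FixedDurationTask('T%i' % i, duration=3, optional=True) for i in range(10)]
for t in tasks:
    t.add_required_resource(car)

# the goal: schedule as many of the optional tasks as possible,
# i.e. maximize how much of the Car resource is actually used
utilization_ind = problem.add_indicator_resource_utilization(car)
ps.MaximizeObjective('MaximizeCarUtilization', utilization_ind)

solution = ps.SchedulingSolver(problem).solve()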

Thanks in advance :)

tpaviot commented 3 years ago

Interesting question, honestly I don't have an out-of-the-box answer to give to you. It's a use case I did not take into account yet.

I had a look at the source code: it needs a small hack to be ready for resource utilization maximization. I will push the code to another branch, together with a unit test and a documentation update.

tpaviot commented 3 years ago

@dreinon you state that it doesn't seem to work. Does the following not work?

res_ut_indic  = problem.add_indicator_resource_utilization(list_of_resources)
MaximizeObjective('UtilizationObjective', res_ut_indic)

dreinon commented 3 years ago

Yes, but in the second line I used problem.MaximizeObjective(res_ut_indic)

tpaviot commented 3 years ago

And that does not give the expected result?

dreinon commented 3 years ago

It didn't for me. For instance, I got more tasks scheduled when I minimized utilization than when I maximized it. Though, I only tried it inside my own problem, so those results might have been due to other constraints. Did you try it on a simple problem, and did it work?

Thanks Thomas!

tpaviot commented 3 years ago

There is a related unittest, see https://github.com/tpaviot/ProcessScheduler/blob/6f2949bfe05de4de0c99ee5ac1756e805cd6d242/test/test_indicator.py#L89

tpaviot commented 3 years ago

Did you manage to get it done?

dreinon commented 3 years ago

Did you manage to get it done?

Yes Thomas, thank you. I made a couple more unit tests with optional tasks and maximizing utilization, and it seemed to work as expected too, so I assumed it was working well and that something else, probably constraints, was causing this behavior.

Thank you again!

tpaviot commented 3 years ago

Could you please share the two additional unit tests you developed? I'll be glad to include them in the unit test suite.

dreinon commented 3 years ago

Sure thing, I'll send them to you.

Also, is there any difference between ps.MaximizeObjective('name', indicator) and problem.maximize_indicator(indicator)?

dreinon commented 3 years ago

I'm having issues again with maximizing and minimizing utilization.

With no use of ForceScheduleNOptionalTasks, and with all my tasks optional, using:

utilization_ind = problem.add_indicator_resource_utilization(Car)
ps.MinimizeObjective('MinimizeCarUtilization', utilization_ind)

I get 16 optional tasks scheduled when I should get none.

My actual objective is to maximize, but since I wasn't getting expected results, I tried to minimize and got this weird behaviour.

Then I found that this was happening because I also had the priorities objective, which was forcing some tasks to get scheduled. After commenting out this priorities objective, in a few seconds I got 0 tasks scheduled when minimizing utilization, as expected.

Related to maximizing: using ForceScheduleNOptionalTasks I have achieved 98% occupation, which is what I would expect from maximizing. However, I get no ending from it. Even if I don't add constraints to the problem, I get no ending, when I should get 100% occupation.

Any ideas?

dreinon commented 3 years ago

Found out that if I schedule as many tasks as there are slots, the problem with maximization and constraints gets solved in a few seconds (though due to constraints I don't get full utilization, I was expecting this behaviour). When I add one and only one more task, with everything else the same (same constraints, etc.), the time to get a solution scales up to 487.13 s ≈ 8 min.

Is that something expected?

tpaviot commented 3 years ago

@dreinon Yes, the multiobjective feature might be a trap, because objectives may conflict. What do you mean by "I get no ending"?

For the second point, I have also noticed that the solving time is not linear. Just adding one task or one constraint can cause the resolution time to blow up. But, basically, task scheduling problems are nonlinear, so it's not a surprise. In many cases, adding one task or one constraint can move a problem from "easy to solve" (short solving time) to "much more difficult to solve" (long solving time).

tpaviot commented 3 years ago

Note to self: check z3 "soft constraints", see https://github.com/Z3Prover/z3/issues/2216 and https://rise4fun.com/Z3/tutorialcontent/optimization
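For reference, a tiny standalone sketch of what z3 soft constraints look like (raw z3, not ProcessScheduler code; the Boolean names are purely illustrative):

from z3 import Optimize, Bool

# each optional task gets a "scheduled" Boolean; add_soft rewards scheduling it,
# but the solver may drop individual soft constraints if they cannot all hold
opt = Optimize()
scheduled_t1, scheduled_t2 = Bool('scheduled_t1'), Bool('scheduled_t2')
opt.add_soft(scheduled_t1, weight=1)
opt.add_soft(scheduled_t2, weight=1)
print(opt.check())   # sat
print(opt.model())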

dreinon commented 3 years ago

@dreinon Yes, the multiobjective feature might be a trap, because objectives may conflict. What do you mean by "I get no ending"?

For the second point, I have also noticed that the solving time is not linear. Just adding one task or one constraint can cause the resolution time to blow up. But, basically, task scheduling problems are nonlinear, so it's not a surprise. In many cases, adding one task or one constraint can move a problem from "easy to solve" (short solving time) to "much more difficult to solve" (long solving time).

By "I get no ending", I mean it doesn't get solved within one hour, which is what I have set as the maximum solving time. About what you were saying on non-linearity, I agree. What makes me doubt is that there's a big jump between giving the problem tasks whose durations sum to no more than the horizon, and giving it one and only one more task.

For example, imagine a problem with horizon 20 and only one resource.

I'm not super familiar with these kinds of problems, and you probably have way more experience with them, but I find this jump in solving time weird.

Also, I don't think the constraints are very significant in this jump, because the number of constraints grows approximately linearly with the number of tasks, and if I remember well, I get this big jump in time even if I set no constraints.

One doubt I have: shouldn't the problem get solved faster the more constrained it is, since there are fewer combinations to try? I'm wondering about that.

Thanks!

dreinon commented 3 years ago

Btw, I tried solving the same problem I mentioned before, the one that took 8 minutes, on a machine with way more computing power, and got the same timing, 8 minutes.

My first machine had 4 vCPUs and 8 GB of RAM, and the second 24 vCPUs and 16 GB.

Any idea how this is possible?

Thanks again!

tpaviot commented 3 years ago

By default, the z3 solver runs in single-threaded mode, so the total number of CPUs does not have any impact. You could try the same problem with parallelization enabled (see https://github.com/tpaviot/ProcessScheduler/blob/35780ab30c7cba7057a5f68f5530a99fdd73f8be/processscheduler/solver.py#L36): just pass the parallel=True boolean to the solver instantiation (the maximum number of concurrent threads is hard-coded to 4 in ProcessScheduler). Some problems proved to be solved faster using parallel computation, some others did not.
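Concretely, a minimal sketch of what that looks like (assuming problem is your ps.SchedulingProblem):

# enable parallel solving; at most 4 worker threads are used (hard-coded)
solver = ps.SchedulingSolver(problem, parallel=True)
solution = solver.solve()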

dreinon commented 3 years ago

Oh okay! Then 4 threads would be the maximum power I could be using right?

More than 4 threads would be wasting money haha

tpaviot commented 3 years ago

Yes, so far 4 is the maximum. If you ever notice a lower solving time, I can commit a change that lets you set the maximum number of threads.

tpaviot commented 3 years ago

It's worth trying the parallel solving option. After a few tests on my machine, it can divide the solving time by a factor of 2.

From the z3 author (https://github.com/Z3Prover/z3/issues/2207):

Threads should be used after some initial rounds of solving. Depending on the problem size it can
take some minutes. By default it doubles the number of threads every time it makes a cubing decision.
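For the curious, enabling the parallel (cube-and-conquer) mode at the raw z3 level looks roughly like this; presumably this is the equivalent of what solver.py does when parallel=True is passed, with option names taken from the z3 documentation:

import z3

# global z3 parameters controlling the parallel solver
z3.set_option("parallel.enable", True)
z3.set_option("parallel.threads.max", 4)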

tpaviot commented 3 years ago

@dreinon Instead of finding a maximum, and thus using an optimization solver, you can also use a non-optimizing solver and check a solution against a "minimum utilization":

utilization_ind = problem.add_indicator_resource_utilization(Car)
pb.add_constraint(utilization_ind > 99)  # for a minimum of 99%

It will be faster than a maximize objective, which looks for the best possible value. Or, if you need a utilization of 100%:

utilization_ind = problem.add_indicator_resource_utilization(Car)
pb.add_constraint(utilization_ind == 100)  ## the double equal is important: in z3 semantics it builds an equality constraint, not a Python comparison

dreinon commented 3 years ago

Awesome!

I'll give it a try and get back to you with news :)

Thank you!!

tpaviot commented 3 years ago

The previous syntax was not right; the correct one is:

utilization_ind = problem.add_indicator_resource_utilization(Car)
pb.add_constraint(utilization_ind.indicator_variable == 100)

I confirm it is much faster (2 or 3 orders of magnitude according to my tests).
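Putting the pieces together, a sketch of the full constraint-based flow (assuming Car is the resource your tasks already require and problem is the ps.SchedulingProblem):

# constrain the indicator variable directly instead of optimizing it
utilization_ind = problem.add_indicator_resource_utilization(Car)
problem.add_constraint(utilization_ind.indicator_variable == 100)

# a plain satisfiability solver is enough, no Maximize/Minimize objective needed
solver = ps.SchedulingSolver(problem)
solution = solver.solve()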

dreinon commented 3 years ago

Just tried constraints.append(utilization_ind.indicator_variable >= 95) and I get No solution exists for problem. I know there's at least one solution with utilization 98; I found it by forcing n optional tasks to be scheduled. Any idea why this could be happening?

Thanks!

tpaviot commented 3 years ago

And what does constraints.append(utilization_ind.indicator_variable == 98) give?

dreinon commented 3 years ago

No solution again :(

Edit: My fault, I was missing something. Found a solution with >= 98 in 50 seconds.

tpaviot commented 3 years ago

Weird. And ==99 or ==100?

tpaviot commented 3 years ago

Make sure you don't have any Maximize or Minimize objective anywhere, so that you don't run an optimization solver.

dreinon commented 3 years ago

Weird. And ==99 or ==100?

Sorry, my fault, see the edit in my last comment.

tpaviot commented 3 years ago

Ah ok, so it works for >= 95?

dreinon commented 3 years ago

Yes, for >= 98.

Trying == 100, but it's taking long; I'll leave it running and get back with news!

Thank you :)

tpaviot commented 3 years ago

Did you have any result?

dreinon commented 3 years ago

It was taking so long that it was not reasonable, since we need a faster solution.

Then, I tried solving for >= 95 and these are my results:

tpaviot commented 3 years ago

Ok, thank you for this instructive feedback. Just a couple of questions related to your project: what do you call a "reasonable time"? How many tasks, how many constraints, and what types of constraints do you use to represent your problem?

dreinon commented 3 years ago

About timing:

About tasks:

About constraints:

EDIT

About constraints:

About resources:

tpaviot commented 3 years ago

And what about the resources?

dreinon commented 3 years ago

And what about the resources?

Check last edit.

tpaviot commented 3 years ago

ok thanks

tpaviot commented 3 years ago

I guess that the for j in availability: loop is nested inside another one, right?

tpaviot commented 3 years ago

What would be cool is to build the smallest test case that represents your problem; no need to have all the semantics, just the core that would enable further investigation. Especially on the solving part, there are plenty of z3 options that could be tested in order to reduce the total computation time. There may be a bottleneck somewhere.

dreinon commented 3 years ago

I agree Thomas.

Regarding the loop, this would be the pseudo-code: [image]

EDIT

Regarding the comment saying "second loop", I meant the same loop for tasks with duration 1 (not ctr1).

dreinon commented 3 years ago

I will work on that smallest example and get back to you when I have something done.

Thanks!

tpaviot commented 3 years ago

ok perfect, indeed you're the guy for this job

dreinon commented 3 years ago

ok perfect, indeed you're the guy for this job

I made an edit to the photo.

tpaviot commented 3 years ago

It is strange that the parallel option has no impact on the computation time. Which OS are you running?

dreinon commented 3 years ago

I'm using a Google Cloud Platform's AI Notebook which is basically a Debian instance that runs Jupyter Lab, Anaconda, and some of the most popular packages for science and ML.

tpaviot commented 3 years ago

You can try decremental solving, without using any optimization solver. For example:

solver = ps.SchedulingSolver(problem, max_time=15)  # 15 s budget per attempt
smt_solver = solver._solver  # access the underlying z3 solver
solution = False

target_utilization = 100
while not solution:
    smt_solver.push()  # save the solver state before adding the temporary constraint
    print("Target utilization:%i%%" % target_utilization)
    smt_solver.add(utilization_ind.indicator_variable >= target_utilization)
    solution = solver.solve()
    smt_solver.pop()  # remove the constraint before trying a lower target
    target_utilization -= 1

That means the solver has 15 s to find a solution for 100%, then 15 s to find a solution that satisfies 99% of resource utilization, and so on down to 0. You can increase max_time to give the solver a bit more time to find a suitable solution.

tpaviot commented 3 years ago

The output for the previous example is:

Target utilization:100%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:99%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:98%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:97%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:96%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:95%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:94%
Example Satisfiability checked in 15.00s
No solution can be found for problem Example because: canceled
Target utilization:93%
Example Satisfiability checked in 11.42s

dreinon commented 3 years ago

Working correctly for my problems!

A good option to avoid optimization.

dreinon commented 3 years ago

I'll close this issue until there is a quicker way to address resource utilization optimization than decremental solving.