fdabrandao / vpsolver

Arc-flow Vector Packing Solver (VPSolver)
http://vpsolver.fdabrandao.pt
BSD 3-Clause "New" or "Revised" License

Explanation on bin packing result generation time #14

Open S1LV3RJ1NX opened 1 year ago

S1LV3RJ1NX commented 1 year ago

Hello @fdabrandao, I have a question regarding the time it takes to generate bin packing results. I am currently using the Gurobi solver. The experiment setup is as follows: I have 8K 4D bins and close to 10K 4D tasks. I am basically trying to simulate a cloud workload in which task requests arrive anywhere in the range of 10-200 at a time. The scenario is as follows:

Thanks!

fdabrandao commented 1 year ago

Hi @S1LV3RJ1NX,

The performance is tied substantially to the size of the graph representing all valid packing patterns. One of the factors with the most impact is the number of items that fit in each bin: when there are many dimensions, multiple bin types, large capacities, and long patterns (many items fitting in a single bin), it is hard to create a compact representation of every valid packing pattern. Note that adding a cardinality constraint (e.g., a maximum of 3 or 4 items per bin) may reduce the size of the graph enough to find a solution.
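For illustration, one way to impose such a cardinality constraint is to append an extra dimension in which every item has size 1 and the bin has capacity equal to the item limit. A minimal sketch using the Python interface (the exact `vbpsolver.solve` signature and the solver script name are assumptions here; check the repository examples for the interface you actually have):

```python
# Sketch: enforce "at most K items per bin" by adding an extra dimension in which
# every item has size 1 and the bin has capacity K. The vbpsolver.solve call below
# assumes pyvpsolver's documented interface and bundled solver scripts; adjust to
# the API/file format you actually use.
from pyvpsolver.solvers import vbpsolver

K = 4                                    # assumed maximum number of items per bin
W = [8, 32, 240, 20]                     # bin capacity: cpu, mem, disk, net
w = [[5, 13, 150, 16], [2, 9, 60, 1]]    # item sizes (one row per item type)
b = [3, 7]                               # demand of each item type

W_card = W + [K]                         # extra "cardinality" dimension on the bin
w_card = [wi + [1] for wi in w]          # every item consumes one slot of it

obj, sol = vbpsolver.solve(W_card, w_card, b, script="vpsolver_gurobi.sh")
print("bins used:", obj)
```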

One thing that may be useful is that heuristic methods tend to work well on exactly the instances that are hard to solve exactly with VPSolver: for heuristics, a bigger set of possible patterns typically helps in finding good solutions, while for VPSolver a larger number of valid packing patterns makes it hard to generate a model small enough to be solvable.
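As a point of comparison, here is a minimal first-fit-decreasing style heuristic for the vector case (plain Python, not part of VPSolver); it gives no optimality guarantee, but on instances with many valid patterns it usually finds a decent solution quickly:

```python
# Minimal first-fit-decreasing heuristic for 4D vector packing (illustrative only,
# not part of VPSolver). Items and capacity are vectors: cpu, mem, disk, net.
def ffd_vector_pack(items, capacity):
    """Return a list of bins, each bin being a list of item indices."""
    dims = range(len(capacity))
    # Largest items first, measured by total normalized size.
    order = sorted(range(len(items)),
                   key=lambda i: sum(items[i][d] / capacity[d] for d in dims),
                   reverse=True)
    bins, residual = [], []              # contents and remaining capacity of open bins
    for i in order:
        for j, res in enumerate(residual):
            if all(items[i][d] <= res[d] for d in dims):
                bins[j].append(i)
                residual[j] = [res[d] - items[i][d] for d in dims]
                break
        else:                            # nothing fits: open a new bin
            bins.append([i])
            residual.append([capacity[d] - items[i][d] for d in dims])
    return bins

print(ffd_vector_pack([[5, 13, 150, 16], [2, 9, 60, 1], [2, 12, 60, 1]],
                      capacity=[8, 32, 240, 20]))
```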

S1LV3RJ1NX commented 1 year ago

Thanks @fdabrandao for the explanation! There were cases when VPSolver was able to find bin-packing solutions immediately. For example, I had a sample trace with 100 tasks and ~250 bins, with all 100 tasks given at once, and I got the solution within a minute. How can I explain those? Is there a mathematical way to say that, based on this many bins and this many tasks, it will take such and such an amount of time?

fdabrandao commented 1 year ago

What was the maximum number of tasks in a single bin in the optimal solution? If you could provide the instances, it should be easy to see why some are easy and some are hard. Just having small items in an instance can make it substantially harder for VPSolver, as the number of valid packing patterns grows really fast. Grouping small tasks into a bigger task would be a way to reduce the graph size.
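For illustration, a rough sketch of such a pre-aggregation step (plain Python; the threshold and the pairing strategy are arbitrary choices made for the sketch):

```python
# Illustrative pre-processing step: merge pairs of small tasks into one "super task"
# before building the arc-flow graph. The threshold (half the bin in every dimension,
# so any merged pair still fits in a bin) and the pairing strategy are arbitrary.
def merge_small_tasks(tasks, capacity, frac=0.5):
    dims = range(len(capacity))
    small, large = [], []
    for t in tasks:
        (small if all(t[d] <= frac * capacity[d] for d in dims) else large).append(t)
    merged = []
    for k in range(0, len(small) - 1, 2):
        a, b = small[k], small[k + 1]
        merged.append([a[d] + b[d] for d in dims])
    if len(small) % 2:                   # one unpaired small task left over
        merged.append(small[-1])
    return large + merged

print(merge_small_tasks([[1, 1, 1, 1], [1, 1, 0, 1], [2, 1, 1, 1], [3, 8, 60, 5]],
                        capacity=[4, 16, 120, 10]))
```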

S1LV3RJ1NX commented 1 year ago

Hello @fdabrandao, I am attaching a few CSV files for reference, which I am using with VPSolver. Check only the cpu, mem, disk and net columns from these files.

Here I am using ms_general_100_baseline_test.csv, placed all at once on vm_alibaba_baseline.csv. Bin costs are given by the VM cost column. Here I get the solution very quickly.

For another experiment, I am trying to place google_tasks_10k_test_new_ts.csv (approx. 10k items) onto google_vms_10k_max_load.csv. Since here I have 10k items, I am placing them in groups of 16 at a time. If I try groups of 64 or above, it takes a long time to generate the solution; sometimes even after 7-10 hrs VPSolver is still searching for a solution. Again, the bin cost is the VM cost given in the CSV.
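For reference, a minimal sketch of the batched placement loop described above; `solve_batch` is a hypothetical stand-in for the actual VPSolver/Gurobi call:

```python
# Sketch of batched placement: solve_batch is a hypothetical helper that should
# return the placements for the batch plus the bins (with residual capacities)
# to carry over to the next batch.
def place_in_batches(tasks, bins, solve_batch, batch_size=16):
    placements = []
    for start in range(0, len(tasks), batch_size):
        batch = tasks[start:start + batch_size]
        assignment, bins = solve_batch(batch, bins)
        placements.extend(assignment)
    return placements, bins
```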

fdabrandao commented 1 year ago

For the first experiment it looks like the patterns should be very short. How many tasks were placed on each machine?

For the second one, I believe you sent google_vms_10k_max_load.csv twice, so I could not see the tasks. However, with that number of items, even if only two or three tasks can be placed on each machine, the graph representing the patterns can get quite big.

S1LV3RJ1NX commented 1 year ago

@fdabrandao apologies for that, I have updated the CSVs, please check.

fdabrandao commented 1 year ago

In the first dataset you have big items such as:

cid ts cpu mem disk net duration
c_15577 180 5 13 150 16 42990
c_83876 720 5 13 150 16 31770
c_62178 720 5 14 150 16 13380
c_52198 720 2 9 60 1 13320
c_44172 720 2 12 60 1 12810
c_93506 720 4 12 120 2 40470

for machines such as:

mid cpu mem disk net cost
v_235 8.0 32.0 240.0 20.0 0.326
v_236 8.0 32.0 240.0 20.0 0.326
v_237 8.0 32.0 240.0 20.0 0.326
v_238 8.0 32.0 240.0 20.0 0.326
v_239 8.0 32.0 240.0 20.0 0.326

It looks like most solutions will have one or two tasks assigned to each machine for the first test. Therefore, the patterns can be reasonably short.

In the second dataset you have many tiny items:

cpu mem disk net duration cid ts
1 1 1 1 1285 c_0 603
1 1 0 1 151 c_1 603
2 1 1 1 5472 c_2 603
2 1 1 1 32353 c_3 603

for big machines:

mid cpu mem disk net cost
v_0 4 16 120 10 0.186
v_1 4 16 120 10 0.186
v_2 4 16 120 10 0.186
v_3 4 16 120 10 0.186
v_4 4 16 120 10 0.186
v_5 4 16 120 10 0.186

...

This results in very long patterns that are very hard to represent in a compact way. For this type of dataset an assignment-based model may be more effective. You can learn about the graph generation method at https://research.fdabrandao.pt/papers/PhDThesisBrandao.pdf; assignment-based models are also discussed there.
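For illustration, a minimal sketch of an assignment-based model written with gurobipy (since Gurobi is already being used); all data and names below are made up for the example:

```python
# Minimal assignment-based vector packing MIP (classical formulation, not VPSolver's
# arc-flow model): x[i, j] = 1 if task i is placed on machine j, y[j] = 1 if machine
# j is used. The data here is a toy example.
import gurobipy as gp
from gurobipy import GRB

tasks = [[1, 1, 1, 1], [2, 1, 1, 1], [1, 1, 0, 1]]      # cpu, mem, disk, net
machines = [[4, 16, 120, 10], [4, 16, 120, 10]]         # machine capacities
cost = [0.186, 0.186]                                   # cost of opening each machine
T, M, D = range(len(tasks)), range(len(machines)), range(4)

m = gp.Model("assignment_packing")
x = m.addVars(len(tasks), len(machines), vtype=GRB.BINARY, name="x")
y = m.addVars(len(machines), vtype=GRB.BINARY, name="y")

m.addConstrs(x.sum(i, "*") == 1 for i in T)             # each task on exactly one machine
m.addConstrs(                                           # capacity in every dimension
    gp.quicksum(tasks[i][d] * x[i, j] for i in T) <= machines[j][d] * y[j]
    for j in M for d in D
)
m.setObjective(gp.quicksum(cost[j] * y[j] for j in M), GRB.MINIMIZE)
m.optimize()

for j in M:
    if y[j].X > 0.5:
        print("machine", j, "->", [i for i in T if x[i, j].X > 0.5])
```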

S1LV3RJ1NX commented 1 year ago

@fdabrandao Thank you for your suggestions, these are very helpful. Can you please recommend some GitHub repos for assignment-based models, similar to VPSolver?

S1LV3RJ1NX commented 1 year ago

@fdabrandao I had a unique observation, can you please explain the reasoning behind it? If I take 100 samples from the first dataset and around 267 bins (a mix of each category), and my bin cost is equal to the machine cost (multiplied by 1000 since we need an integer) as shown in dataset 1 (please refer to your comment), then the ILP is not solved even after 6-8 hrs; but if I change the cost to a function of the cpu and mem of that bin's capacity, such as 9*cpu + 1*mem, I get an answer within seconds. How does the cost of a bin affect the ILP solving time?