fdabrandao / vpsolver

Arc-flow Vector Packing Solver (VPSolver)
http://vpsolver.fdabrandao.pt
BSD 3-Clause "New" or "Revised" License

Explanation on bin packing result generation time #14

Open S1LV3RJ1NX opened 1 year ago

S1LV3RJ1NX commented 1 year ago

Hello @fdabrandao, I have a question regarding the time it takes to generate bin packing results. I am currently using the Gurobi solver. The experiment setup is as follows: I have 8K 4D bins and close to 10K 4D tasks. I am basically trying to simulate a cloud workload in which task requests arrive anywhere in the range of 10-200 at a time. The scenario is as follows:

Thanks!

fdabrandao commented 1 year ago

Hi @S1LV3RJ1NX,

The performance is tied substantially to the size of the graph representing all valid packing patterns. One of the factors with the most impact is the number of items that fit in each bin: when there are many dimensions, multiple bin types, large capacities, and long patterns (many items fitting in a single bin), it is hard to create a compact representation of every valid packing pattern. Note that adding a cardinality constraint (e.g., a maximum of 3 or 4 items per bin) may reduce the size of the graph enough to find a solution.
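For illustration, one way to impose such a cardinality constraint is to append an extra dimension in which every item has size 1 and the bin has capacity equal to the item limit. A minimal sketch using the Python interface (the exact `vbpsolver.solve` signature and the solver script name are assumptions here; check the repository examples for the interface you actually have):

```python
# Sketch: enforce "at most K items per bin" by adding an extra dimension in which
# every item has size 1 and the bin has capacity K. The vbpsolver.solve call below
# assumes pyvpsolver's documented interface and bundled solver scripts; adjust to
# the API/file format you actually use.
from pyvpsolver.solvers import vbpsolver

K = 4                                    # assumed maximum number of items per bin
W = [8, 32, 240, 20]                     # bin capacity: cpu, mem, disk, net
w = [[5, 13, 150, 16], [2, 9, 60, 1]]    # item sizes (one row per item type)
b = [3, 7]                               # demand of each item type

W_card = W + [K]                         # extra "cardinality" dimension on the bin
w_card = [wi + [1] for wi in w]          # every item consumes one slot of it

obj, sol = vbpsolver.solve(W_card, w_card, b, script="vpsolver_gurobi.sh")
print("bins used:", obj)
```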

One thing that may be useful is that heuristic methods tend to work well on exactly the instances that are hard to solve exactly with VPSolver: for heuristics, a bigger set of possible patterns typically helps in finding good solutions, while for VPSolver a larger number of valid packing patterns makes it hard to generate a model small enough to be solvable.
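As a point of comparison, here is a minimal first-fit-decreasing style heuristic for the vector case (plain Python, not part of VPSolver); it gives no optimality guarantee, but on instances with many valid patterns it usually finds a decent solution quickly:

```python
# Minimal first-fit-decreasing heuristic for 4D vector packing (illustrative only,
# not part of VPSolver). Items and capacity are vectors: cpu, mem, disk, net.
def ffd_vector_pack(items, capacity):
    """Return a list of bins, each bin being a list of item indices."""
    dims = range(len(capacity))
    # Largest items first, measured by total normalized size.
    order = sorted(range(len(items)),
                   key=lambda i: sum(items[i][d] / capacity[d] for d in dims),
                   reverse=True)
    bins, residual = [], []              # contents and remaining capacity of open bins
    for i in order:
        for j, res in enumerate(residual):
            if all(items[i][d] <= res[d] for d in dims):
                bins[j].append(i)
                residual[j] = [res[d] - items[i][d] for d in dims]
                break
        else:                            # nothing fits: open a new bin
            bins.append([i])
            residual.append([capacity[d] - items[i][d] for d in dims])
    return bins

print(ffd_vector_pack([[5, 13, 150, 16], [2, 9, 60, 1], [2, 12, 60, 1]],
                      capacity=[8, 32, 240, 20]))
```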

S1LV3RJ1NX commented 1 year ago

Thanks @fdabrandao for the explanation! There were cases when VPSolver was able to find bin-packing solutions immediately. For example, I had a sample trace with 100 tasks and ~250 bins, with all 100 tasks given at once, and I got the solution within a minute. How can I explain those? Is there a mathematical way to say that, based on this many bins and this many tasks, it will take such and such an amount of time?

fdabrandao commented 1 year ago

What was the maximum number of tasks in a single bin in the optimal solution? If you could provide the instances, it should be easy to see why some are easy and some are hard. Just having small items in an instance can make it substantially harder for VPSolver, as the number of valid packing patterns grows really fast. Grouping small tasks into a bigger task would be a way to reduce the graph size.
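For illustration, a rough sketch of such a pre-aggregation step (plain Python; the threshold and the pairing strategy are arbitrary choices made for the sketch):

```python
# Illustrative pre-processing step: merge pairs of small tasks into one "super task"
# before building the arc-flow graph. The threshold (half the bin in every dimension,
# so any merged pair still fits in a bin) and the pairing strategy are arbitrary.
def merge_small_tasks(tasks, capacity, frac=0.5):
    dims = range(len(capacity))
    small, large = [], []
    for t in tasks:
        (small if all(t[d] <= frac * capacity[d] for d in dims) else large).append(t)
    merged = []
    for k in range(0, len(small) - 1, 2):
        a, b = small[k], small[k + 1]
        merged.append([a[d] + b[d] for d in dims])
    if len(small) % 2:                   # one unpaired small task left over
        merged.append(small[-1])
    return large + merged

print(merge_small_tasks([[1, 1, 1, 1], [1, 1, 0, 1], [2, 1, 1, 1], [3, 8, 60, 5]],
                        capacity=[4, 16, 120, 10]))
```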

S1LV3RJ1NX commented 1 year ago

Hello @fdabrandao, I am attaching a few CSV files for reference, which I am using with VPSolver. Check only the cpu, mem, disk and net columns from these files.

Here I am using ms_general_100_baseline_test.csv, placed all at once on vm_alibaba_baseline.csv. Bin costs are given by the VM cost column. Here I get the solution very quickly.

For another experiment, I am trying to place google_tasks_10k_test_new_ts.csv (approx. 10k items) onto google_vms_10k_max_load.csv. Since here I have 10k items, I am placing them in groups of 16 at a time. If I try groups of 64 or above, it takes a long time to generate the solution; sometimes even after 7-10 hrs VPSolver is still searching for a solution. Again, the bin cost is the VM cost given in the CSV.
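For reference, a minimal sketch of the batched placement loop described above; `solve_batch` is a hypothetical stand-in for the actual VPSolver/Gurobi call:

```python
# Sketch of batched placement: solve_batch is a hypothetical helper that should
# return the placements for the batch plus the bins (with residual capacities)
# to carry over to the next batch.
def place_in_batches(tasks, bins, solve_batch, batch_size=16):
    placements = []
    for start in range(0, len(tasks), batch_size):
        batch = tasks[start:start + batch_size]
        assignment, bins = solve_batch(batch, bins)
        placements.extend(assignment)
    return placements, bins
```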

fdabrandao commented 1 year ago

For the first experiment it looks like the patterns should be very short. How many tasks were placed on each machine?

For the second one, I believe you sent google_vms_10k_max_load.csv twice, so I could not see the tasks. However, with that number of items, even if only two or three tasks can be placed on each machine, the graph representing the patterns can get quite big.

S1LV3RJ1NX commented 1 year ago

@fdabrandao apologies for that, I have updated the CSVs, please check.

fdabrandao commented 1 year ago

In the first dataset you have big items such as:

cid ts cpu mem disk net duration
c_15577 180 5 13 150 16 42990
c_83876 720 5 13 150 16 31770
c_62178 720 5 14 150 16 13380
c_52198 720 2 9 60 1 13320
c_44172 720 2 12 60 1 12810
c_93506 720 4 12 120 2 40470

for machines such as:

mid cpu mem disk net cost
v_235 8.0 32.0 240.0 20.0 0.326
v_236 8.0 32.0 240.0 20.0 0.326
v_237 8.0 32.0 240.0 20.0 0.326
v_238 8.0 32.0 240.0 20.0 0.326
v_239 8.0 32.0 240.0 20.0 0.326

It looks like most solutions will have one or two tasks assigned to each machine for the first test. Therefore, the patterns can be reasonably short.

In the second dataset you have many tiny items:

cpu mem disk net duration cid ts
1 1 1 1 1285 c_0 603
1 1 0 1 151 c_1 603
2 1 1 1 5472 c_2 603
2 1 1 1 32353 c_3 603

for big machines:

mid cpu mem disk net cost
v_0 4 16 120 10 0.186
v_1 4 16 120 10 0.186
v_2 4 16 120 10 0.186
v_3 4 16 120 10 0.186
v_4 4 16 120 10 0.186
v_5 4 16 120 10 0.186

...

This results in very long patterns that are very hard to represent in a compact way. For this type of dataset an assignment-based model may be more effective. You can learn about the graph generation method at https://research.fdabrandao.pt/papers/PhDThesisBrandao.pdf; assignment-based models are also discussed there.
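For illustration, a minimal sketch of an assignment-based model written with gurobipy (since Gurobi is already being used); all data and names below are made up for the example:

```python
# Minimal assignment-based vector packing MIP (classical formulation, not VPSolver's
# arc-flow model): x[i, j] = 1 if task i is placed on machine j, y[j] = 1 if machine
# j is used. The data here is a toy example.
import gurobipy as gp
from gurobipy import GRB

tasks = [[1, 1, 1, 1], [2, 1, 1, 1], [1, 1, 0, 1]]      # cpu, mem, disk, net
machines = [[4, 16, 120, 10], [4, 16, 120, 10]]         # machine capacities
cost = [0.186, 0.186]                                   # cost of opening each machine
T, M, D = range(len(tasks)), range(len(machines)), range(4)

m = gp.Model("assignment_packing")
x = m.addVars(len(tasks), len(machines), vtype=GRB.BINARY, name="x")
y = m.addVars(len(machines), vtype=GRB.BINARY, name="y")

m.addConstrs(x.sum(i, "*") == 1 for i in T)             # each task on exactly one machine
m.addConstrs(                                           # capacity in every dimension
    gp.quicksum(tasks[i][d] * x[i, j] for i in T) <= machines[j][d] * y[j]
    for j in M for d in D
)
m.setObjective(gp.quicksum(cost[j] * y[j] for j in M), GRB.MINIMIZE)
m.optimize()

for j in M:
    if y[j].X > 0.5:
        print("machine", j, "->", [i for i in T if x[i, j].X > 0.5])
```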

S1LV3RJ1NX commented 1 year ago

@fdabrandao Thank you for your suggestions, these are very helpful. Can you please recommend some GitHub repos for assignment-based models, similar to VPSolver?

S1LV3RJ1NX commented 1 year ago

@fdabrandao I had a unique observation, can you please explain the reasoning behind it? If I take 100 samples from the first dataset and around 267 bins (a mix of each category), and my bin cost is equal to the machine cost (multiplied by 1000 since we need an integer) as shown in dataset 1 (please refer to your comment), then the ILP is not solved even after 6-8 hrs; but if I change the cost to a function of the cpu and mem of that bin's capacity, such as 9*cpu + 1*mem, I get an answer within seconds. How does the cost of a bin affect the ILP solving time?