Hi @S1LV3RJ1NX,
The performance is tied substantially to the size of the graph representing all valid packing patterns. The thing with the most impact is the number of items that fit in each bin. When there are many dimensions, multiple bin types, large capacities, and long patterns (many items fit in a single bin), it is hard to create a compact representation of every valid packing pattern. Note that adding a cardinality constraint (e.g., a maximum of 3 or 4 items per bin) may reduce the size of the graph enough to find a solution.
One thing that may be useful is that heuristic methods tend to work well on the instances that are hard to solve exactly with VPSolver, since for heuristics a bigger set of possible patterns typically helps in finding good solutions. For VPSolver, the larger number of valid packing patterns makes it hard to generate a model small enough to be solvable.
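One way to enforce the cardinality constraint mentioned above is to add a dummy dimension in which every item has size 1 and the bin capacity equals the per-bin item limit. Here is a minimal sketch in Python of that standard modeling trick (the names and values are illustrative, not a specific VPSolver API):

```python
# Sketch: enforce "at most MAX_ITEMS_PER_BIN items per bin" by adding an
# extra dummy dimension (a common modeling trick; names below are made up).

MAX_ITEMS_PER_BIN = 4

# Original 4D instance: (cpu, mem, disk, net)
bin_capacity = [8, 32, 240, 20]
items = [
    [5, 13, 150, 16],
    [2, 9, 60, 1],
    [4, 12, 120, 2],
]

# Append the cardinality dimension: the bin gets MAX_ITEMS_PER_BIN units,
# every item consumes exactly 1 unit, so no pattern can exceed the limit.
bin_capacity_card = bin_capacity + [MAX_ITEMS_PER_BIN]
items_card = [w + [1] for w in items]

print(bin_capacity_card)  # [8, 32, 240, 20, 4]
print(items_card[0])      # [5, 13, 150, 16, 1]
```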
Thanks @fdabrandao for the explanation! There were cases where VPSolver was able to find bin-packing solutions immediately. For example, I had a sample trace with 100 tasks and ~250 bins, with all 100 tasks given at once, and I got the solution within a minute. How can I explain those cases? Is there a mathematical way to say that, given this many bins and this many tasks, it will take roughly this much time?
What was the maximum number of tasks in a single bin in the optimal solution? If you could provide the instances, it should be easy to see why some are easy and some are hard. Just having small items in an instance can make it substantially harder for VPSolver, since the number of valid packing patterns grows really fast. Grouping small tasks into a bigger task would be a way to reduce the graph size (see the sketch below).
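To illustrate the grouping idea (this is only a pre-processing sketch with made-up names and an arbitrary 0.5-capacity threshold, not anything provided by VPSolver): greedily merge small tasks into aggregated tasks whose summed dimensions stay within a fraction of the bin capacity.

```python
# Sketch: greedily merge tiny tasks into larger "super-tasks" so that fewer,
# bigger items reach VPSolver and the pattern graph stays smaller.
# Names and the 0.5-capacity threshold are illustrative assumptions.

def group_small_tasks(tasks, capacity, fill_ratio=0.5):
    """tasks: list of dimension vectors; capacity: bin dimension vector."""
    limit = [c * fill_ratio for c in capacity]
    groups, current = [], None
    for w in sorted(tasks, key=sum):      # consider the smallest tasks first
        if current is None:
            current = list(w)
            continue
        merged = [a + b for a, b in zip(current, w)]
        if all(m <= l for m, l in zip(merged, limit)):
            current = merged              # still small enough: keep merging
        else:
            groups.append(current)        # close this group, start a new one
            current = list(w)
    if current is not None:
        groups.append(current)
    return groups

capacity = [4, 16, 120, 10]
tasks = [[1, 1, 1, 1], [1, 1, 0, 1], [2, 1, 1, 1], [2, 1, 1, 1]]
print(group_small_tasks(tasks, capacity))
```

In practice you would also keep track of which original task IDs end up in each group, and accept that grouping trades some optimality for a much smaller pattern graph.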
Hello @fdabrandao, I am attaching a few CSV files for reference which I am using with VPSolver. Please check only the cpu, mem, disk, and net columns from these files.
Here I am placing ms_general_100_baseline_test.csv all at once on vm_alibaba_baseline.csv. Bin costs are given by the VM cost column. Here I get the solution very quickly.
For another experiment, I am trying to place google_tasks_10k_test_new_ts.csv (approx. 10k items) onto google_vms_10k_max_load.csv. Since I have 10k items here, I am placing them in groups of 16 at a time. If I try groups of 64 or above, it takes a lot of time to generate the solution; sometimes even after 7-10 hrs VPSolver is still searching. Again, the bin cost is the VM cost mentioned in the CSV.
For the first experiment it looks like the patterns should be very short. How many tasks were placed on each machine?
For the second one, I believe you sent google_vms_10k_max_load.csv twice, so I could not see the tasks. However, with that number of items, even if only two or three tasks can be placed on each machine, the graph representing the patterns can get quite big.
@fdabrandao apologies for that, I have updated the CSVs, please check.
In the first dataset you have big items such as:
cid | ts | cpu | mem | disk | net | duration |
---|---|---|---|---|---|---|
c_15577 | 180 | 5 | 13 | 150 | 16 | 42990 |
c_83876 | 720 | 5 | 13 | 150 | 16 | 31770 |
c_62178 | 720 | 5 | 14 | 150 | 16 | 13380 |
c_52198 | 720 | 2 | 9 | 60 | 1 | 13320 |
c_44172 | 720 | 2 | 12 | 60 | 1 | 12810 |
c_93506 | 720 | 4 | 12 | 120 | 2 | 40470 |
for machines such as:
mid | cpu | mem | disk | net | cost |
---|---|---|---|---|---|
v_235 | 8.0 | 32.0 | 240.0 | 20.0 | 0.326 |
v_236 | 8.0 | 32.0 | 240.0 | 20.0 | 0.326 |
v_237 | 8.0 | 32.0 | 240.0 | 20.0 | 0.326 |
v_238 | 8.0 | 32.0 | 240.0 | 20.0 | 0.326 |
v_239 | 8.0 | 32.0 | 240.0 | 20.0 | 0.326 |
It looks like most solutions will have one or two tasks assigned to each machine for the first test. Therefore, the patterns can be reasonably short.
In the second dataset you have many tiny items:
cpu | mem | disk | net | duration | cid | ts |
---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1285 | c_0 | 603 |
1 | 1 | 0 | 1 | 151 | c_1 | 603 |
2 | 1 | 1 | 1 | 5472 | c_2 | 603 |
2 | 1 | 1 | 1 | 32353 | c_3 | 603 |
for big machines:
mid | cpu | mem | disk | net | cost |
---|---|---|---|---|---|
v_0 | 4 | 16 | 120 | 10 | 0.186 |
v_1 | 4 | 16 | 120 | 10 | 0.186 |
v_2 | 4 | 16 | 120 | 10 | 0.186 |
v_3 | 4 | 16 | 120 | 10 | 0.186 |
v_4 | 4 | 16 | 120 | 10 | 0.186 |
v_5 | 4 | 16 | 120 | 10 | 0.186 |
...
This results in very long patterns that are very hard to represent in a compact way. For this type of dataset, an assignment-based model can be used more effectively. You can learn about the graph generation method at https://research.fdabrandao.pt/papers/PhDThesisBrandao.pdf, where assignment-based models are also discussed.
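For reference, here is a rough sketch of the classic assignment-based formulation in gurobipy (this is not VPSolver's arc-flow model; the data and variable names are illustrative):

```python
# Sketch of the classic assignment-based formulation for multi-dimensional
# bin packing (not VPSolver's arc-flow model). Data below is illustrative.
import gurobipy as gp
from gurobipy import GRB

# items[i] and bins[j] are 4D size/capacity vectors (cpu, mem, disk, net)
items = [[1, 1, 1, 1], [1, 1, 0, 1], [2, 1, 1, 1], [2, 1, 1, 1]]
bins = [[4, 16, 120, 10], [4, 16, 120, 10]]
cost = [0.186, 0.186]          # cost of using each bin
D = len(bins[0])               # number of dimensions

m = gp.Model("assignment_bpp")
x = m.addVars(len(items), len(bins), vtype=GRB.BINARY, name="x")  # item i -> bin j
y = m.addVars(len(bins), vtype=GRB.BINARY, name="y")              # bin j is used

# Each item is assigned to exactly one bin
m.addConstrs((x.sum(i, "*") == 1 for i in range(len(items))), name="assign")

# Capacity in every dimension, and only bins that are used may receive items
m.addConstrs(
    (gp.quicksum(items[i][d] * x[i, j] for i in range(len(items)))
     <= bins[j][d] * y[j]
     for j in range(len(bins)) for d in range(D)),
    name="cap",
)

# Minimize the total cost of the bins used
m.setObjective(gp.quicksum(cost[j] * y[j] for j in range(len(bins))), GRB.MINIMIZE)
m.optimize()

if m.status == GRB.OPTIMAL:
    for i in range(len(items)):
        for j in range(len(bins)):
            if x[i, j].X > 0.5:
                print(f"item {i} -> bin {j}")
```

With thousands of identical machines this model is highly symmetric, so some form of symmetry breaking (e.g., ordering the y variables of identical bin types) is usually needed to make it tractable.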
@fdabrandao Thank you for your suggestions, these are very helpful. Can you please recommend some GitHub repos for assignment-based models, similar to VPSolver?
@fdabrandao I had a curious observation, can you please explain the reasoning behind it? If I take 100 samples from the first dataset and around 267 bins (a mix of all categories), and my bin cost equals the machine cost (multiplied by 1000, since we need an integer) as shown in dataset 1 (please refer to your comment), then the ILP is not solved even after 6-8 hrs. But if I change the cost to a function of the cpu and mem capacity of that bin, such as 9*cpu + 1*mem, I get an answer within seconds. How does the cost of a bin affect the ILP solving time?
Hello @fdabrandao, I have a doubt regarding the time it takes to generate bin-packing results. I am currently using the Gurobi solver. The experimental setup is as follows: I have 8K 4D bins and close to 10K 4D tasks. I am basically trying to simulate a cloud workload in which task requests arrive anywhere in the range of 10-200 at a time. The scenario is as follows:
Thanks!