apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0

`heron update` re-packing exception #2576

Open huijunw opened 6 years ago

huijunw commented 6 years ago

How to reproduce:

  1. On GitHub master, submit the example topology in a deactivated state: `~/bin/heron submit --deploy-deactivated --verbose local ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology 3`
  2. Update it with a dry run: `~/bin/heron update --verbose --dry-run local WordCountTopology --component-parallelism consumer:4`

IMO, `heron update` should be able to update the example topology above successfully by simply adding one more container.

Instead, `heron update` throws an exception and fails to update. See the log below.

[2017-11-20 14:08:21 -0800] [FINE] com.twitter.heron.scheduler.RuntimeManagerMain: Exception when submitting topology 
com.twitter.heron.spi.packing.PackingException: Could not initialize containers using existing packing plan
    at com.twitter.heron.packing.builder.PackingPlanBuilder.initContainers(PackingPlanBuilder.java:259)
    at com.twitter.heron.packing.builder.PackingPlanBuilder.addInstance(PackingPlanBuilder.java:153)
    at com.twitter.heron.packing.builder.PackingPlanBuilder.addInstance(PackingPlanBuilder.java:141)
    at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.placeFFDInstance(FirstFitDecreasingPacking.java:312)
    at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.assignInstancesToContainers(FirstFitDecreasingPacking.java:265)
    at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.getFFDAllocation(FirstFitDecreasingPacking.java:246)
    at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.repack(FirstFitDecreasingPacking.java:180)
    at com.twitter.heron.scheduler.RuntimeManagerRunner.buildNewPackingPlan(RuntimeManagerRunner.java:304)
    at com.twitter.heron.scheduler.RuntimeManagerRunner.updateTopologyHandler(RuntimeManagerRunner.java:183)
    at com.twitter.heron.scheduler.RuntimeManagerRunner.call(RuntimeManagerRunner.java:81)
    at com.twitter.heron.scheduler.RuntimeManagerMain.callRuntimeManagerRunner(RuntimeManagerMain.java:448)
    at com.twitter.heron.scheduler.RuntimeManagerMain.manageTopology(RuntimeManagerMain.java:396)
    at com.twitter.heron.scheduler.RuntimeManagerMain.main(RuntimeManagerMain.java:317)
Caused by: com.twitter.heron.packing.ResourceExceededException: Insufficient container resources to add instancePlan {component-name: consumer, task-id: 5, component-index: 1, instance-resource: {cpu: 1.000000, ram: ByteAmount{1 GB (1073741824 bytes)}, disk: ByteAmount{1 GB (1073741824 bytes)}}} to container {containerId=2, instances=[{component-name: word, task-id: 2, component-index: 1, instance-resource: {cpu: 1.000000, ram: ByteAmount{1 GB (1073741824 bytes)}, disk: ByteAmount{1 GB (1073741824 bytes)}}}], capacity={cpu: 2.000000, ram: ByteAmount{4 GB (4294967296 bytes)}, disk: ByteAmount{2 GB (2147483648 bytes)}}, paddingPercentage=10}
    at com.twitter.heron.packing.builder.PackingPlanBuilder.getContainers(PackingPlanBuilder.java:392)
    at com.twitter.heron.packing.builder.PackingPlanBuilder.initContainers(PackingPlanBuilder.java:256)
    ... 12 more
Caused by: com.twitter.heron.packing.ResourceExceededException: Adding ByteAmount{1 GB (1073741824 bytes)} bytes of disk to existing ByteAmount{1 GB (1073741824 bytes)} bytes with 10 percent padding would exceed capacity ByteAmount{2 GB (2147483648 bytes)}
    at com.twitter.heron.packing.builder.Container.assertHasSpace(Container.java:170)
    at com.twitter.heron.packing.builder.Container.add(Container.java:77)
    at com.twitter.heron.packing.builder.PackingPlanBuilder.addToContainer(PackingPlanBuilder.java:417)
    at com.twitter.heron.packing.builder.PackingPlanBuilder.getContainers(PackingPlanBuilder.java:390)
    ... 13 more

[2017-11-20 14:08:21 -0800] [ERROR]: Could not initialize containers using existing packing plan
[2017-11-20 14:08:21 -0800] [ERROR]: Failed to update topology in dry-run mode: WordCountTopology
[2017-11-20 14:08:21 -0800] [DEBUG]: Elapsed time: 0.514s.

avflor commented 6 years ago

Hi @huijunw. The FFD repacking algorithm only works if the initial packing was also done with the FFD algorithm. Is that the case here?
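
To make the disk failure in the trace concrete, here is a minimal Java sketch of the padding arithmetic that the error message describes. It assumes the check pads the sum of the existing and requested bytes by the padding percentage and compares the result against the container capacity. The class and method names (`PaddingCheckSketch`, `increaseBy`, `hasSpace`) are hypothetical and are not Heron's actual code.

```java
// Illustrative sketch only: hypothetical names, not the Heron implementation.
class PaddingCheckSketch {
    static final long GB = 1024L * 1024L * 1024L;

    // Assumption: padding grows the byte count by the given percentage.
    static long increaseBy(long bytes, int percentage) {
        return bytes + bytes * percentage / 100;
    }

    // Returns true if the padded total of used + requested fits in capacity.
    static boolean hasSpace(long usedBytes, long requestedBytes,
                            long capacityBytes, int paddingPercentage) {
        long padded = increaseBy(usedBytes + requestedBytes, paddingPercentage);
        return padded <= capacityBytes;
    }

    public static void main(String[] args) {
        // Disk values from the log: 1 GB existing, 1 GB requested, 2 GB capacity.
        System.out.println("disk fits: " + hasSpace(1 * GB, 1 * GB, 2 * GB, 10));
        // RAM values from the log: 1 GB existing, 1 GB requested, 4 GB capacity.
        System.out.println("ram fits:  " + hasSpace(1 * GB, 1 * GB, 4 * GB, 10));
    }
}
```

With the numbers from the log, 1 GB + 1 GB padded by 10 percent is about 2.2 GB, which exceeds the 2 GB disk capacity, while the same request fits within the 4 GB RAM capacity. Under that assumption, a container sized by a packing algorithm that does not apply the same padding could fail this check when the existing plan is reloaded.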

huijunw commented 6 years ago

If a topology uses the RoundRobinPacking packing algorithm, which repacking algorithm should it use?

avflor commented 6 years ago

The idea was that RoundRobin packing would be replaced by the resource-compliant round robin packing, and that the scaling capability would work only for topologies that have switched to the resource-compliant algorithm.