Stratosphere is now Apache Flink.
https://github.com/apache/incubator-flink

Ozone Memory Layout #132

Open andrehacker opened 11 years ago

andrehacker commented 11 years ago

Hi residents of the Stratosphere,

I found it a bit difficult to get a big picture of Ozone's memory usage, so I created this small comparison graphic (Ozone vs. YARN). You can ignore the YARN side...

Can someone confirm that it looks like this on the high level?

[Attached graphic: memory layout comparison, Ozone vs. YARN]

Having such a high-level picture in the documentation would be great (with the relevant configuration option names added). I also omitted details like default values (the memory manager now defaults to 0.7 * heap space, if I read the source correctly).
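To make the arithmetic behind that default concrete, here is a minimal sketch, assuming the 0.7 fraction mentioned above; the class and variable names are purely illustrative and not the actual Stratosphere API:

```java
// Hypothetical sketch of the managed-memory arithmetic; names are illustrative.
public class MemoryBudgetSketch {

    public static void main(String[] args) {
        long heapBytes = 10L * 1024 * 1024 * 1024;   // TaskManager heap, e.g. 10 GB
        long configuredManagedBytes = -1;            // taskmanager.memory.size, -1 = not set

        // If taskmanager.memory.size is not set, the memory manager reportedly
        // falls back to a fraction (0.7) of the available heap.
        long managedBytes = configuredManagedBytes > 0
                ? configuredManagedBytes
                : (long) (0.7 * heapBytes);

        long restForUdfs = heapBytes - managedBytes; // what remains for user code / Java objects
        System.out.printf("managed: %d MB, rest for UDFs: %d MB%n",
                managedBytes >> 20, restForUdfs >> 20);
    }
}
```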

Network Buffers

At first I thought (and heard) that the network buffers are direct buffers and not taken from the heap space. But during my experiments (cloud-11) I found that they seem to come from the heap:

Settings that did not work:
- DEFAULT_NEPHELE_TM_HEAP: 10 GB
- taskmanager.memory.size: 8 GB
- Network buffers: 4 GB (numberBuffers: 32 K, bufferSize: 128 KB)
=> The TaskManager did not start; not enough heap space for 8 + 4 GB.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space 
        at eu.stratosphere.nephele.services.memorymanager.spi.DefaultMemoryManager.<init>(DefaultMemoryManager.java:144) 
        at eu.stratosphere.nephele.services.memorymanager.spi.DefaultMemoryManager.<init>(DefaultMemoryManager.java:100) 
        at eu.stratosphere.nephele.taskmanager.TaskManager.<init>(TaskManager.java:322) 
        at eu.stratosphere.nephele.taskmanager.TaskManager.main(TaskManager.java:374)

Settings that worked:
- DEFAULT_NEPHELE_TM_HEAP: 10 GB
- taskmanager.memory.size: 8 GB
- Network buffers: 1 GB (numberBuffers: 32 K, bufferSize: 32 KB)
=> Worked; there is 1 GB of heap space left.
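As a back-of-the-envelope check of the two experiments (a hypothetical helper, not Stratosphere code), the buffer memory is simply numberBuffers * bufferSize, and the failure is expected whenever managed memory plus buffer memory exceeds the heap:

```java
// Hypothetical check of the two settings above: buffers = numberBuffers * bufferSize.
public class NetworkBufferCheck {

    static void check(long heapBytes, long managedBytes, long numberBuffers, long bufferSizeBytes) {
        long bufferBytes = numberBuffers * bufferSizeBytes;
        boolean fits = managedBytes + bufferBytes <= heapBytes;
        System.out.printf("buffers = %d MB, managed + buffers = %d GB, heap = %d GB -> %s%n",
                bufferBytes >> 20, (managedBytes + bufferBytes) >> 30, heapBytes >> 30,
                fits ? "fits" : "OutOfMemoryError expected");
    }

    public static void main(String[] args) {
        long GB = 1L << 30, KB = 1L << 10;
        // Failing case: 32 K buffers * 128 KB = 4 GB; 8 GB managed + 4 GB buffers > 10 GB heap.
        check(10 * GB, 8 * GB, 32 * 1024, 128 * KB);
        // Working case: 32 K buffers * 32 KB = 1 GB; 8 GB + 1 GB <= 10 GB heap.
        check(10 * GB, 8 * GB, 32 * 1024, 32 * KB);
    }
}
```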

rmetzger commented 11 years ago

Yes, the network buffers are taken from the heap space, as byte arrays. This was changed by https://github.com/dimalabs/ozone/pull/53.
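For anyone reading along, a minimal illustration of the distinction being discussed (this is not the code from that pull request): heap buffers are backed by byte arrays and count against -Xmx, whereas direct buffers are allocated off-heap.

```java
import java.nio.ByteBuffer;

// Illustration only: heap-backed vs. direct buffers in plain Java NIO.
public class BufferKinds {
    public static void main(String[] args) {
        byte[] heapSegment = new byte[32 * 1024];                       // plain byte array on the heap
        ByteBuffer heapBuffer = ByteBuffer.allocate(32 * 1024);         // also backed by a heap byte[]
        ByteBuffer directBuffer = ByteBuffer.allocateDirect(32 * 1024); // off-heap, not counted in -Xmx

        System.out.println(heapSegment.length);      // 32768
        System.out.println(heapBuffer.isDirect());   // false
        System.out.println(directBuffer.isDirect()); // true
    }
}
```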

8 + 4 makes 12; how should this work with 10 GB of heap space?

I don't see any problems with your picture, but you should probably ask @StephanEwen. You could consider renaming "Rest for UDFs .." to "Java Objects: UDFs ..."

StephanEwen commented 11 years ago

It looks good to me, and I agree with Robert's comments.

When using YARN with Stratosphere, each YARN container will use its memory like the TaskManager heap. In addition, YARN runs a container daemon on each machine (afaik) which monitors the resource consumption of the JVMs running inside the individual containers.

andrehacker commented 11 years ago

Okay, thanks for the confirmation and the link to #53 - so we don't use any direct buffers anymore.

My picture of Hadoop refers to Hadoop YARN and MapReduce. Yes, every node runs a NodeManager that kills containers using too much memory, which is the only resource considered currently. There is a lot of overhead involved for MapReduce + YARN, since it first acquires a new container (= JVM) for the MapRedApplicationMaster and then acquires additional containers for every map and reduce task (there are usually many per job). Also, containers cannot be reused in YARN. I am not sure whether we are affected by this overhead with Stratosphere + YARN, or whether we can have long-running containers and run all tasks for one node in a single container. But that is a separate issue...

@rmetzger: Yes, my experiments are obsolete now - I just posted them because I had heard before that we use direct buffers and could not reconcile that with the errors (I had already assumed that we use the heap).

When we update the documentation we need to consider that we no longer use direct buffers here: http://stratosphere2.dima.tu-berlin.de/wiki/doku.php/wiki:clustersetup#configuring_the_network_buffers