Open raduchilom opened 9 years ago
Is it possible to use somehow yarn-cluster mode, so the new driver process would be managed by YARN resource manager inside a cluster?
It is possible to use that, but the problem still remains, because it spins a new process on the server machine in order to launch the context, and on this machine we want to have some memory management for all the created processes.
In yarn-cluster mode the resource management of the driver process lies on YARN. It can launch it on any node in the cluster that has enough resources for it.
Because we are spinning a new JVM with custom amount of memory for each context, we should limit the amount of memory that can be used for all the JVM processes. If the specified memory is full we should not create other contexts until the amount of memory necessary for the new processes is freed.