docker-archive / classicswarm

Swarm Classic: a container clustering system. Not to be confused with Docker Swarm, which lives at https://github.com/docker/swarmkit
Apache License 2.0

Heavy parallel Save/Load exhausts memory, OOM killer then kills manager #1163

Closed: dhiltgen closed this issue 8 years ago

dhiltgen commented 9 years ago

I'm running a 17-node cluster, and when I do a Save/Load of an image I just built on the cluster (to distribute it throughout the cluster so I can run it on any node), the manager's memory usage spikes. For example, a ~550M image pushes the resident size of the manager up to 19G. When I run a few of these Save/Load scenarios concurrently, I can easily push the manager's memory high enough to trigger the kernel's OOM killer, which then kills the manager.

chanwit commented 9 years ago

It seems we need the bypass mode for this and similar scenarios.

chanwit commented 9 years ago

@dhiltgen this won't solve your problem, but I have been facing it too.

My workaround in the meantime is to run a local registry on the same machine as the manager. After the build, I push the image to that registry first and then pull it on the nodes; pulling doesn't trigger the OOM for me.

FYI, my image is ~200M, but my cluster is 50 nodes.

abronan commented 9 years ago

Generally speaking, we should have some sort of memory and CPU cap for the Manager process that can be configured. This is especially needed because docker-machine runs the Manager on the same machine as the Agents, and we don't want the Manager to eat up all the resources (running a Manager alongside an Agent also seems to be common practice among users).

Using the registry is a good workaround for this, even though we still want save and load to work as expected without crashing the Manager daemon. This probably means queuing save and load requests and executing them in parallel only while we stay under the CPU and memory cap (see the sketch below).

Also, the amount of resources used by the Manager should be deducted from the available CPU and RAM in the output of docker info if it is running on the same machine as an Agent (merging the Manager and the Agent seems like a good option for this). Imho the Manager should be a special case of a local scheduler: each command that goes through the Manager should account for the resources it uses, and we should check that the Manager process stays light enough not to eat into the overall resources available on the machine.
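
A minimal sketch of the request-queuing idea above, assuming a hypothetical runSaveLoad handler and a made-up concurrency limit: a buffered channel acts as a counting semaphore so the manager never runs more than a fixed number of save/load transfers at once.

```go
package main

import (
	"fmt"
	"sync"
)

// maxConcurrentTransfers caps how many save/load operations run at once.
// The name and value are illustrative, not a real Swarm setting.
const maxConcurrentTransfers = 2

// transferSlots is a counting semaphore implemented as a buffered channel.
var transferSlots = make(chan struct{}, maxConcurrentTransfers)

// runSaveLoad stands in for the manager's save/load handler.
func runSaveLoad(image string) {
	transferSlots <- struct{}{}        // block until a slot is free
	defer func() { <-transferSlots }() // release the slot when done
	fmt.Println("transferring", image)
}

func main() {
	var wg sync.WaitGroup
	for _, img := range []string{"img-a", "img-b", "img-c", "img-d"} {
		wg.Add(1)
		go func(img string) {
			defer wg.Done()
			runSaveLoad(img)
		}(img)
	}
	wg.Wait()
}
```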

aluzzardi commented 9 years ago

Weird ...

load/save are streamed without buffering to RAM

dongluochen commented 8 years ago

DockerClient makes in-memory copies of the image being uploaded. We may want to change that implementation.
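
For context, a minimal sketch of the difference (function names and endpoint are illustrative, not the actual dockerclient API): buffering the whole tar before the POST makes memory grow with image size, while passing the reader straight through keeps memory roughly constant.

```go
package example

import (
	"bytes"
	"io"
	"net/http"
)

// loadImageBuffered reads the whole image tar into memory before uploading,
// so a ~550MB image costs at least that much resident memory per request.
func loadImageBuffered(url string, image io.Reader) error {
	data, err := io.ReadAll(image)
	if err != nil {
		return err
	}
	resp, err := http.Post(url, "application/x-tar", bytes.NewReader(data))
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// loadImageStreamed hands the reader straight to the HTTP client, which
// sends the body as it reads it, keeping memory use roughly constant.
func loadImageStreamed(url string, image io.Reader) error {
	resp, err := http.Post(url, "application/x-tar", image)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}
```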

vieux commented 8 years ago

@jimmyxian can you please take a look?

jimmyxian commented 8 years ago

@vieux Sure, I've been trying to work on this issue recently. @dongluochen any good advice on this?

dongluochen commented 8 years ago

@jimmyxian I just added a PR: https://github.com/samalba/dockerclient/pull/192. Please take a look.

jimmyxian commented 8 years ago

@dongluochen Good point. Load is not streaming in dockerclient.

Also, maybe there is another problem here: https://github.com/docker/swarm/blob/master/cluster/swarm/cluster.go#L511

WDYT?

aluzzardi commented 8 years ago

@jimmyxian I think that is taken care of by io.Copy: it only holds a small buffer and won't read more until the writes are done.
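
For reference, io.Copy reuses a single small buffer (32 KB by default) rather than accumulating the stream, so a relay through the manager stays at constant memory. A sketch with the buffer made explicit:

```go
package example

import "io"

// relay copies an image tarball from src (e.g. the save response from one
// engine) to dst (e.g. the load request body for another engine). The one
// 32 KB buffer is reused for the whole transfer, so memory use does not
// grow with image size. Illustrative only, not the actual Swarm code.
func relay(dst io.Writer, src io.Reader) (int64, error) {
	buf := make([]byte, 32*1024) // same default size io.Copy allocates internally
	return io.CopyBuffer(dst, src, buf)
}
```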

dongluochen commented 8 years ago

@jimmyxian In streaming mode, io.Copy is dominated by the slowest writer. The Swarm manager doesn't use more memory; it is just as slow as the slowest connection, which is fine. The real problem with streaming mode is that a broken node would fail the whole load operation, because its io.Pipe gets closed while io.Copy is still writing. We may need to see whether node refreshment is adequate for such a problem.
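
A rough sketch of the fan-out pattern being described, assuming a hypothetical loadOnNode call (this is an illustration, not the actual cluster.go code): io.Copy into an io.MultiWriter over per-node pipes advances only at the pace of the slowest node, and a node that closes its pipe with an error fails the whole copy.

```go
package example

import "io"

// fanOut streams one image tarball to several nodes at once. loadOnNode is a
// hypothetical per-node load call that reads the tarball from r until EOF.
func fanOut(image io.Reader, nodes []string, loadOnNode func(node string, r io.Reader) error) error {
	writers := make([]io.Writer, len(nodes))
	pipeWriters := make([]*io.PipeWriter, len(nodes))
	errs := make(chan error, len(nodes))

	for i, node := range nodes {
		pr, pw := io.Pipe()
		writers[i] = pw
		pipeWriters[i] = pw
		go func(node string, pr *io.PipeReader) {
			err := loadOnNode(node, pr)
			// Closing the read end with an error makes the next write to this
			// pipe fail, which fails the MultiWriter and hence the whole copy.
			pr.CloseWithError(err)
			errs <- err
		}(node, pr)
	}

	// MultiWriter writes each chunk to every pipe in turn, and each pipe write
	// blocks until its node has consumed the data, so the copy is dominated by
	// the slowest node. Memory stays bounded; time does not.
	_, copyErr := io.Copy(io.MultiWriter(writers...), image)

	// Close the write ends so every node sees EOF (or the copy error).
	for _, pw := range pipeWriters {
		pw.CloseWithError(copyErr)
	}

	for range nodes {
		if err := <-errs; err != nil && copyErr == nil {
			copyErr = err
		}
	}
	return copyErr
}
```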

nazar-pc commented 8 years ago

In #1464 I didn't have any failed nodes, just a fresh cluster with 2 nodes, and both were fine.

dongluochen commented 8 years ago

@dhiltgen @nazar-pc #1494 updates dockerclient so that LoadImage runs in streaming mode. Please rebuild and test whether this fixes your problem. There should no longer be a memory spike from the command. The side effect is that unhealthy nodes can block the load command; I think the right solution is to fail nodes faster.
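
One way to read "fail nodes faster", sketched with a hypothetical per-node load call (this is not what #1494 does): put a deadline on each node's part of the transfer so a stuck node's pipe is closed with an error instead of blocking the shared copy indefinitely.

```go
package example

import (
	"errors"
	"io"
	"time"
)

// errNodeTimeout is a hypothetical error marking a node as too slow.
var errNodeTimeout = errors.New("node did not finish loading in time")

// loadWithDeadline runs a hypothetical per-node load call and, if it has not
// finished within timeout, closes that node's pipe with an error so a blocked
// writer on the other end is released rather than stalling forever.
func loadWithDeadline(load func(r io.Reader) error, pr *io.PipeReader, timeout time.Duration) error {
	done := make(chan error, 1) // buffered so a late send after a timeout doesn't block
	go func() { done <- load(pr) }()

	select {
	case err := <-done:
		pr.CloseWithError(err)
		return err
	case <-time.After(timeout):
		pr.CloseWithError(errNodeTimeout)
		return errNodeTimeout
	}
}
```

Note that in the MultiWriter fan-out sketched earlier, aborting one node's pipe still fails the shared copy, which is presumably why the thread leans toward faster node health refresh (dropping unhealthy nodes before they take part in a load) rather than per-request timeouts alone.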

nazar-pc commented 8 years ago

Is there a nightly build of the Swarm image on Docker Hub that I can try with Docker Machine?

abronan commented 8 years ago

@nazar-pc Yes, you can use: dockerswarm/swarm:latest

nazar-pc commented 8 years ago

@dongluochen, I confirm that both save and load work fine now.

abronan commented 8 years ago

Fixed by #1494