Closed coolljt0725 closed 7 years ago
Also being affected by this on docker 1.12.1.
The time between provisioning a host and the first errors appears to be very short. From the logs, it DOES look like it recovers, but it can get back into the bad state and stay there.
@coolljt0725 interested in the stress app/script. Is it open source?
@coolljt0725 A couple of weeks ago I ran your test with docker-stress, and also increased the parallelism to 100 and more with 50/100/1000 containers.
I was not able to reproduce the problem.
I believe it has to do with disk drive speed (I am running on a pretty empty SSD), disk overload, and maybe disk space and fragmentation. Somebody has in fact reported that the stale data issue is easily reproducible on old spinning hard drives.
At this moment we have not heard enough such complaints, and given the issue seems related to slow disks, it does not seem a strong enough argument for a pervasive code rework. Also, given that there will likely be issues with the other approach as well, a few of which were already raised in the https://github.com/docker/libnetwork/pull/1135 comments, in my opinion we should hold off on this and revisit later.
We're running into this issue a lot. We have an image with a `VOLUME /foo` where there's 1-2GB of data in /foo that needs to be copied from the image into the volume at container creation time. I think that's causing us to run into this issue quickly even though we are running on bare metal hosts with SSDs in a RAID0 config. I can repro with our image with `docker-stress` easily.
Our workaround for now is to not make that directory a volume, but we incur a performance penalty at runtime because of that. Is there any other workaround for this issue, or any progress on fixing it?
@aboch This happens on some of our servers, and I'm sure it has something to do with drive speed. For now, our workaround is to increase the transientTimeout of boltdb. I hope libnetwork could set the transientTimeout rather than just use the default timeout. WDYT?
@coolljt0725 After this issue, when we looked at the current timeout, we thought 10 seconds was already a big one. But I agree, we can increase it, because it is clear something wrong is happening down the chain and that does not seem to be under our control. We might as well increase the timeout to a minute, as long as that makes things better. Were you guys able to verify that the bigger timeout improved things?
Given that libnetwork does not create the bolt interface with the persistent connection option, it can't currently set the transient timeout. I think it is fine to just change the default in the libkv project, as you are doing with your PR.
@coolljt0725 What about instead changing libkv to set `BoltDB.timeout = options.ConnectionTimeout` if `options.PersistConnection==false && options.ConnectionTimeout != 0`? In the end, that is the timeout which is used by bolt.Open(). Then we can control it via libnetwork in https://github.com/docker/libnetwork/blob/release/v0.8/datastore/datastore.go#L136.
@aboch We increased the timeout to 2 min on our servers and this problem never happened again. This is still just a workaround, though, and there is a small cost in container start time, but it's not a big deal at the moment.
> What about instead changing libkv to set BoltDB.timeout = options.ConnectionTimeout if options.PersistConnection==false && options.ConnectionTimeout != 0. At the end that is the timeout which is used by bolt.Open(). Then we can control it via libnetwork in https://github.com/docker/libnetwork/blob/release/v0.8/datastore/datastore.go#L136.
That's a good idea. :+1:
Closed by #1546
To support container live restore, we persist driver endpoints to the store, which is a good approach for each network driver. But persisting endpoints to the store causes a performance issue: it takes more time to run a container, and the situation is worse in parallel. Here are some test results using https://github.com/crosbymichael/docker-stress.
The stress.json is
docker 1.11.2 with live restore (we backported the live restore patch)
If we increase the concurrent workers of `stress`, there will be a lot of `timeout` errors, see
docker 1.11.2 without live restore
There is a significant performance decrease with live restore. Persisting driver endpoints to the store also brings some consistency issues, so I suggest we reconsider the `re-construct endpoint` approach: if an endpoint can be reconstructed, we avoid persisting it. I still think the less we persist to the store, the better. @aboch @mavenugo WDYT?