Closed binghamchris closed 8 years ago
Just found a note in issue https://github.com/rancher/convoy/issues/73 indicating that there's a non-configurable 60 second timeout for all commands.
So I had a bit of a rethink and tested pigz (parallel gzip), and found that it just barely manages to compress the 4.1GB volume snapshot in under a minute on my system:
# date; pigz 5204cf2a-5a95-4362-b853-7a11b7f07b33_724390fb-2ca6-4590-82b7-50d1a6fbde67.tar.gz.tmp ; date
Mon 11 Apr 11:25:07 CEST 2016
Mon 11 Apr 11:26:06 CEST 2016
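The same date; command; date measurement can be wrapped in a small helper that reports whether a command finished inside a given limit. This is only a sketch: the function name fits_in_limit is made up for illustration, and the 60 second figure in the usage note is the timeout mentioned in issue 73, not something the script reads from Convoy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, print its wall-clock duration in
# whole seconds, and succeed only if it finished under the given limit.
fits_in_limit() {
  local limit=$1; shift
  local start end elapsed
  start=$(date +%s)          # seconds since the epoch before the command
  "$@"                       # run the command under test (exit status ignored)
  end=$(date +%s)            # seconds since the epoch after the command
  elapsed=$((end - start))
  echo "elapsed: ${elapsed}s (limit: ${limit}s)"
  [ "$elapsed" -lt "$limit" ]
}
```

Calling it as fits_in_limit 60 pigz followed by the snapshot file name would repeat the measurement above and return non-zero if the compression took a minute or longer.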
So I'd like to propose two things:
convoy daemon command so that it can be enabled on systems with pigz installed
Hi @binghamchris
You should be able to use the --cmd-timeout parameter in the latest release: https://github.com/rancher/convoy/releases/tag/v0.5.0-rc1
Hi @yasker
Thanks to you and @jinuxstyle for the update :) However, I'm afraid there appears to be a bug in v0.5.0-rc1. While attempting to test the update I ran into this:
# convoy snapshot create gitlab-prod-data --name snap-gitlab-prod-data-1463237239 --cmd-timeout 10m
Incorrect Usage.
NAME:
convoy snapshot create - create a snapshot for certain volume: snapshot create <volume>
USAGE:
convoy snapshot create [command options] [arguments...]
OPTIONS:
--name name of snapshot
ERRO[0000] Error when executing command: flag provided but not defined: -cmd-timeout
{
"Error": "Error when executing command: flag provided but not defined: -cmd-timeout"
}
The position of the --cmd-timeout argument doesn't affect the outcome; the same error message is displayed regardless.
I found that a "CmdTimeout":"" was present in /var/lib/rancher/convoy/convoy.cfg, so I set that as follows:
{"Root":"/var/lib/rancher/convoy","DriverList":["vfs"],"DefaultDriver":"vfs","MountNamespaceFD":"","IgnoreDockerDelete":false,"CreateOnDockerMount":false,"CmdTimeout":"10m"}
However, this doesn't appear to have had any impact. Running a snapshot command against a large volume (the same one that originally prompted me to file this issue) still timed out after about 60 seconds:
# date; convoy snapshot create gitlab-prod-data --name snap-gitlab-prod-data-1463237239; date
Sat 14 May 17:04:21 CEST 2016
ERRO[0062] Error response from server, Timeout executing: gzip [/var/lib/rancher/convoy/vfs/snapshots/gitlab-prod-data_snap-gitlab-prod-data-1463237239.tar.gz.tmp], output , error <nil>
{
"Error": "Error response from server, Timeout executing: gzip [/var/lib/rancher/convoy/vfs/snapshots/gitlab-prod-data_snap-gitlab-prod-data-1463237239.tar.gz.tmp], output , error \u003cnil\u003e\n"
}
Sat 14 May 17:05:23 CEST 2016
I've tried to figure out where the issue may be in the code; however, I'm afraid it's too sophisticated for my limited coding abilities.
I'm happy to provide any other testing or information I can to help, though :)
The position of the --cmd-timeout argument doesn't affect the outcome; the same error message is displayed regardless.
The --cmd-timeout option should be specified when starting the convoy daemon. It's not a valid option for the convoy client command line, which is how you used it.
I found that a "CmdTimeout":"" was present in /var/lib/rancher/convoy/convoy.cfg, so I set that as follows:
Did you edit convoy.cfg manually? If so, you need to restart the convoy daemon for the change to take effect.
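For reference, the manual edit described above could be scripted roughly as follows. This is a sketch under assumptions: the helper name set_cmd_timeout is hypothetical, it relies on the compact single-line JSON layout shown earlier in the thread, and it assumes the value contains no "/" or "&" characters. How the daemon is restarted afterwards depends on how it was started on your system, so that step is left as a comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the "CmdTimeout" value in a convoy.cfg-style
# file, keeping a backup copy of the original next to it.
set_cmd_timeout() {
  local cfg=$1 value=$2
  cp -p "$cfg" "$cfg.bak"    # keep the original as <file>.bak
  # Replace whatever string currently follows "CmdTimeout": with the new value.
  sed "s/\"CmdTimeout\":\"[^\"]*\"/\"CmdTimeout\":\"$value\"/" "$cfg.bak" > "$cfg"
}

# Example (run as root, with the daemon stopped):
#   set_cmd_timeout /var/lib/rancher/convoy/convoy.cfg 10m
# Then restart the convoy daemon so that it rereads the file.
```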
Apologies @jinuxstyle, I misunderstood where this was supposed to be used.
It's now working as intended when the --cmd-timeout option is used as you directed... I can back up my GitLab volumes again, thanks! 😄
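For anyone landing here with the same timeout, a daemon start line along these lines is what the fix amounts to. Treat it as a sketch: only --cmd-timeout and the 10m value come from this thread, while the --drivers and --driver-opts vfs.path arguments are assumptions based on the Convoy README, and the storage path is a placeholder. The helper only prints the command so it can be checked against convoy daemon --help before being run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) a daemon start command that raises
# the command timeout. Arguments: timeout value, VFS storage path.
convoy_daemon_cmd() {
  local timeout=$1 vfs_path=$2
  # --drivers / --driver-opts are assumed from the Convoy README;
  # --cmd-timeout is the daemon option discussed in this thread.
  printf 'convoy daemon --drivers vfs --driver-opts vfs.path=%s --cmd-timeout %s\n' \
    "$vfs_path" "$timeout"
}

convoy_daemon_cmd 10m /path/to/vfs/storage
```

The client commands, such as convoy snapshot create, stay exactly as before; the timeout is a daemon setting only.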
When attempting to snapshot a large (multi-GB) volume, convoy reports a timeout executing the gzip command.
The convoy command run is:
convoy snapshot create $vol_name --name $snapshot_name
And the resulting error message is:
Manually executing gzip on the file shows that it takes about 2.5 minutes to complete:
Looking at the contents of the Convoy VFS directory, it seems that Convoy is giving up and terminating the gzip process after about a minute:
I've rummaged through the code here, but can't find either where this timeout is set or where it can be configured. I'm wondering whether the timeout is intentional, and I'd like to find out how to overcome it, please.
Observed with Convoy 0.4.3 on CentOS 7.2 and Docker 1.10.3 on a NUC 6I5SYK (Core i5-6260U, 32GB RAM, PCIe SSD). The multi-GB volume came about as a result of using Convoy to manage volumes for a container running GitLab CE, which uses Git LFS to version control large binary objects (in this case, an ISO).