genezys / docker-gitlab

Dockerized Omnibus GitLab
https://registry.hub.docker.com/u/genezys/gitlab/
Apache License 2.0

How to backup and restore the gitlab data? #4

Closed NanXiao closed 7 years ago

NanXiao commented 9 years ago

Hi genezys,

I am new to Docker and am now using your GitLab image.

After executing these commands:

docker run --name gitlab_data genezys/gitlab:7.5.2 /bin/true
docker run --detach --name gitlab --publish 8080:80 --publish 2222:22 --volumes-from gitlab_data genezys/gitlab:7.5.2 

How do I back up and restore the GitLab data? I have also posted a question on SO and hope you can help. Thanks very much in advance!

Best Regards, Nan Xiao

genezys commented 9 years ago

I see 2 options:

You can use the GitLab official backup process. This will require that you run the backup process in a container, most likely the existing running application container. I don't know if the GitLab backup process requires that GitLab services are running. If it does, you will have to run the backup process using docker exec with your running application container. The backup process will generate a backup in the volumes that you can retrieve with docker cp.

Or you can use the Docker way of backing up volumes. This is explained in the Docker documentation. The idea is to run yet another container, attach the volumes of your existing data container and TAR everything to a local file. To restore, use the same idea to extract your TAR onto your volumes. This backs up the whole filesystem of your database and repositories, so it may generate a larger backup.
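A minimal sketch of the first option, assuming the application container is named gitlab (as in the commands at the top of this thread), that the image provides Omnibus GitLab's gitlab-rake wrapper, and that archives land in the Omnibus default directory /var/opt/gitlab/backups. The function name gitlab_official_backup and the DOCKER override variable are illustrative only and not part of this image.

```shell
#!/bin/sh
# Sketch only: run GitLab's own backup task in the running container,
# then copy the resulting archives to the host.
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview commands.
DOCKER="${DOCKER:-docker}"

gitlab_official_backup() {
    container="$1"   # name of the running application container (assumed)
    dest="$2"        # host directory that receives the backup archives

    # Create the backup with GitLab's rake task inside the container.
    $DOCKER exec "$container" gitlab-rake gitlab:backup:create || return 1

    # Copy the backup directory (Omnibus default location) to the host.
    $DOCKER cp "$container:/var/opt/gitlab/backups" "$dest"
}
```

It could be called as `gitlab_official_backup gitlab ./gitlab_backups`. As far as the GitLab documentation describes, this backup does not include the configuration under /etc/gitlab, so that directory may need to be saved separately.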

I have not yet used the backup process myself, so any workable solution you come up with could be documented in the README.

NanXiao commented 9 years ago

The docker run --name gitlab_data genezys/gitlab:7.5.2 /bin/true command creates this container for storage purposes only. So I think I should back up the whole image, right? The Docker way may not work, since there are many folders that need to be backed up.

genezys commented 9 years ago

You cannot back up the data container image since it does not actually contain your data. The only purpose of the data container is to link to volumes that will contain your data. Volumes are not part of the image.

In order to back up the volumes of the data container, you should use a command like:

sudo docker run --volumes-from gitlab_data -v $(pwd):/backup ubuntu tar cvf /backup/gitlab_data.tar /var/opt/gitlab /var/log/gitlab /etc/gitlab

genezys commented 9 years ago

I forgot to add that this command will generate a file called gitlab_data.tar in the current directory; that file is your backup.

NanXiao commented 9 years ago

@genezys Now I can back up the data using your command:

docker run --volumes-from gitlab_data -v $(pwd):/backup genezys/gitlab:7.5.2 tar cvf /backup/gitlab_data.tar /var/opt/gitlab /var/log/gitlab /etc/gitlab

But when I mount the data:

docker run --name gitlab_data --volume /var/opt/gitlab --volume /var/log/gitlab --volume /etc/gitlab genezys/gitlab:7.5.2 /bin/true
docker run --detach --name gitlab_app --publish 8080:80 --publish 2222:22 --volumes-from gitlab_data genezys/gitlab:7.5.2

I find that GitLab still can't find the previous data (my test.git project):

root@3e0fd2ce0f78:/var/opt/gitlab/git-data/repositories/root# ls -alt
total 0
drwxrwx---. 2 git git  6 May  8 06:52 .
drwxrws---. 3 git git 17 May  8 06:52 ..

But the data can be found on host:

[root@localhost root]# ls -alt
total 8
drwxrwx---. 4 polkitd ssh_keys   41 May  8 02:34 .
drwxrwx---. 7 polkitd ssh_keys 4096 May  8 02:34 test.git
drwxrwx---. 7 polkitd ssh_keys 4096 May  8 02:34 test.wiki.git
drwxrws---. 3 polkitd ssh_keys   17 May  8 02:32 ..

In addition:

[root@localhost root]# ls -lt /var/opt/gitlab/git-data/repositories/root/test.git/hooks
lrwxrwxrwx. 1 polkitd ssh_keys 47 May  8 02:34 /var/opt/gitlab/git-data/repositories/root/test.git/hooks -> /opt/gitlab/embedded/service/gitlab-shell/hooks

The /opt/gitlab directory isn't backed up. Do we also need to back up this folder?

Thanks very much in advance! I look forward to your response!

genezys commented 9 years ago

You may have done something strange: your data should never be on the host. It should be contained inside a Docker volume only.

You may have configured your container with a host-based volume or something like that.

I suggest that you docker inspect your containers in order to understand your volume configuration.
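A small sketch of that check, assuming a reasonably recent Docker release where docker inspect exposes the volume configuration under the .Mounts key (releases from the time of this thread may expose it under a different key). The function name show_mounts and the DOCKER override variable are illustrative only.

```shell
#!/bin/sh
# Sketch only: print the volume configuration of a container as JSON.
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview the command.
DOCKER="${DOCKER:-docker}"

show_mounts() {
    container="$1"   # container to inspect, e.g. gitlab_data
    # Each entry lists the source on the host and the destination in the container.
    $DOCKER inspect --format '{{ json .Mounts }}' "$container"
}
```

Running `show_mounts gitlab_data` should show, for each of /var/opt/gitlab, /var/log/gitlab and /etc/gitlab, whether its source is a directory managed by Docker or a path you chose on the host.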

NanXiao commented 9 years ago

@genezys: The root cause is that I didn't map the host directories to the container volumes. The correct command should be:

docker run --name gitlab_data --volume /var/opt/gitlab:/var/opt/gitlab --volume /var/log/gitlab:/var/log/gitlab --volume /etc/gitlab:/etc/gitlab genezys/gitlab:7.5.2 /bin/true

But I still have 2 questions: (1) /opt/gitlab holds the application code for GitLab and its dependencies, so it doesn't need to be backed up, right? (2) Could you add the backup command to README.md?

Thanks, Nan Xiao

genezys commented 9 years ago

/opt/gitlab is indeed contained in the application image, so it is not data and should not be backed up.

I do not know why you used host-based volumes. I prefer Docker-managed volumes, as I don't have to worry about where they are stored on my host.

I agree that I should document how to back up and restore your GitLab installation. I will keep this issue open for that.

NanXiao commented 9 years ago

@genezys I used host-based volumes because that is where the backup data ends up. E.g., I used the following command to back up the data:

docker run --volumes-from gitlab_data -v $(pwd):/backup genezys/gitlab:7.5.2 tar cvf /backup/gitlab_data.tar /var/opt/gitlab /var/log/gitlab /etc/gitlab 

Then, if GitLab crashes for some reason, I need to restore the data:

(1) Extract the data on the host:

tar xvf gitlab_data.tar -C /

(2) Mount the data volume and do the mapping:

docker run --name gitlab_data --volume /var/opt/gitlab:/var/opt/gitlab --volume /var/log/gitlab:/var/log/gitlab --volume /etc/gitlab:/etc/gitlab genezys/gitlab:7.5.2 /bin/true

(3) Run gitlab server:

docker run --detach --name gitlab_app --publish 8080:80 --publish 2222:22 --volumes-from gitlab_data genezys/gitlab:7.5.2

Is there a better method?

genezys commented 9 years ago

Why did you extract the data on the host directly and not inside the gitlab_data volumes?

I would restore the backup by first mounting the TAR file as a host-based volume and then extracting it inside the container.

docker run --volumes-from gitlab_data -v $(pwd)/gitlab_data.tar:/backup.tar ubuntu tar xvf /backup.tar -C /

This will extract the TAR into the volumes of gitlab_data, and your GitLab is restored.
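The reason the restore targets `-C /` is that tar stores the absolute paths given at backup time without their leading slash, so extracting relative to / puts every file back where it came from. The following sketch shows that behaviour without Docker, using a throwaway directory; the names workdir and first_member are illustrative only.

```shell
#!/bin/sh
# Sketch only: show that tar drops the leading "/" from absolute paths.
workdir="$(mktemp -d)"
mkdir -p "$workdir/data"
echo "hello" > "$workdir/data/file.txt"

# Archive an absolute path, as the backup command in this thread does.
# tar warns on stderr that it removes the leading "/"; silence that here.
tar cf "$workdir/demo.tar" "$workdir/data" 2>/dev/null

# The first member name in the archive no longer starts with "/".
first_member="$(tar tf "$workdir/demo.tar" | head -n 1)"
echo "$first_member"

# Clean up the throwaway directory.
rm -rf "$workdir"
```

The same listing step (`tar tf gitlab_data.tar`) can be used to look over a real backup before extracting it into the volumes.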

NanXiao commented 9 years ago

@genezys: Yeah, your method is better than mine! BTW, in README.md you mention "We assume using a data volume container, this will simplify migrations and backups.", but to be honest, I still can't understand why this method is simpler. Could you explain it? Thanks in advance!

genezys commented 9 years ago

The "other" way I was thinking about is to store your data inside the application container. The problem with this is that you cannot start a newer version of the GitLab container without losing your data.

For backups, the "simplicity" comes from the fact that volumes let you back up the actual filesystem without worrying about a functional, application-aware backup.
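The migration case can be sketched as follows, assuming the container names gitlab_app and gitlab_data and the port mappings used earlier in this thread. The function name upgrade_gitlab_app, the DOCKER override variable, and whatever newer image tag is passed in are hypothetical.

```shell
#!/bin/sh
# Sketch only: replace the application container with one from a newer image
# while the data stays in the volumes held by the gitlab_data container.
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview commands.
DOCKER="${DOCKER:-docker}"

upgrade_gitlab_app() {
    new_image="$1"   # newer image to run, e.g. genezys/gitlab:<newer tag>

    # Stop and remove the old application container; volumes are untouched.
    $DOCKER stop gitlab_app || return 1
    $DOCKER rm gitlab_app || return 1

    # Start the new version against the same data container.
    $DOCKER run --detach --name gitlab_app \
        --publish 8080:80 --publish 2222:22 \
        --volumes-from gitlab_data "$new_image"
}
```

Because the application container holds no data of its own, removing it loses nothing; only gitlab_data and its volumes need to survive.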