yboetz / pyznap

ZFS snapshot tool written in python
GNU General Public License v3.0

How to install it? #1

Closed Skaronator closed 6 years ago

Skaronator commented 6 years ago

Just a quick question: I know a bit about Linux, and I know that Python has a package manager called pip, but how do I install this?

I tried to install it via pip:

pip3 install git+https://github.com/cythoning/pyznap
Collecting git+https://github.com/cythoning/pyznap
  Cloning https://github.com/cythoning/pyznap to /tmp/pip-n051elol-build
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/lib/python3.6/tokenize.py", line 452, in open
        buffer = _builtin_open(filename, 'rb')
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-n051elol-build/setup.py'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-n051elol-build/

And why does it need pytest? It would be only required for development.

yboetz commented 6 years ago

Hi

Yes, pytest is only required for development.

For installing I would suggest the following, which is the way I have it installed.

First of all, I would suggest you install virtualenv + virtualenvwrapper, so you don't have to install the packages on the system python installation.

Then go to the folder where you want pyznap installed (for me that is /opt) and clone the git repository

cd /opt
git clone git@github.com:cythoning/pyznap.git

This will create a folder /opt/pyznap/ and download all files from GitHub. Then install the required Python packages (best in your virtualenv).

cd pyznap
pip install -r requirements.txt

Copy the config file to /etc/pyznap/ and modify it to suit your system.

mkdir /etc/pyznap
rsync -av /opt/pyznap/pyznap.conf /etc/pyznap/pyznap.conf

Once you've set up your config file correctly, all that remains is to let pyznap run regularly; I would suggest once per hour. For this, just open your crontab file

nano /etc/crontab

and add a line like

0 * * * *   root    /path/to/python /opt/pyznap/src/pyznap.py snap >> /var/log/pyznap.log

This will run pyznap once per hour and take/clean snapshots the way you specified in the config. If you also want it to send backups to another destination, then you specify this in the config file and add a second line to your crontab file

0 0 * * *   root    /path/to/python /opt/pyznap/src/pyznap.py send >> /var/log/pyznap.log

This will back up your filesystems once per day at midnight (12am).
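As a rough illustration of what the two schedules above mean, here is a simplified sketch in Python. It is not how cron or pyznap is implemented; `cron_matches` is a made-up helper that only understands `*` and plain integers, whereas real cron also accepts ranges, lists and steps.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if a simplified 5-field cron expression fires at `when`.

    Hypothetical helper for illustration: supports only '*' and integers.
    """
    fields = expr.split()  # minute, hour, day of month, month, day of week
    # cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    actual = [
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,
    ]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


HOURLY = "0 * * * *"  # schedule of the `snap` line
DAILY = "0 0 * * *"   # schedule of the `send` line

print(cron_matches(HOURLY, datetime(2018, 1, 1, 13, 0)))  # top of an hour -> True
print(cron_matches(DAILY, datetime(2018, 1, 1, 12, 0)))   # noon -> False
print(cron_matches(DAILY, datetime(2018, 1, 1, 0, 0)))    # midnight -> True
```

So the first line fires at minute 0 of every hour, and the second only at 00:00.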

If you want pyznap to only take snapshots and never delete them, run it as pyznap.py snap --take. If you only want it to clean up old snapshots, use pyznap.py snap --clean.

I would also suggest giving root ownership of all files, so that no other user can modify them.

chown root:root -R /etc/pyznap
chown root:root -R /opt/pyznap

That should about be it. If you have any problems please let me know.

Skaronator commented 6 years ago

Thanks @cythoning this was damn simple!

Just one question about this config part:

Missing values will be filled automatically if parent is in config

Is there a way to avoid this behavior? Docker generates a lot of filesystems and deletes them on updates, as you can see here: [image]

Skaronator commented 6 years ago

[StorageZFS/Docker]
hourly = 0
daily = 7
weekly = 8
monthly = 0
snap = yes
clean = yes

[StorageZFS/Docker/*]
snap = no

Would be a good implementation IMO

yboetz commented 6 years ago

At the moment, no. I deliberately designed it such that snapshots are atomic and recursive, and I don't plan to change this. But shouldn't it take snapshots of your docker images as well? If something goes wrong you can then simply roll back this one container. I don't use docker, so I don't exactly know how it stores the containers.
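To picture the "missing values are filled from the parent" behavior being asked about, here is a minimal sketch using Python's `configparser`. It is only an assumption about how such inheritance could work, not pyznap's actual code; `effective_policy` and the child section `StorageZFS/Docker/data` are hypothetical names made up for the example.

```python
import configparser

# Example config: the child section only sets `daily`;
# everything else is expected to come from the parent.
CONFIG = """
[StorageZFS/Docker]
hourly = 0
daily = 7
weekly = 8
monthly = 0
snap = yes
clean = yes

[StorageZFS/Docker/data]
daily = 14
"""


def effective_policy(parser, dataset):
    """Merge the settings of `dataset` with those of its configured ancestors.

    Hypothetical helper: walks from the pool name down to the dataset,
    so that values set on a child override values set on a parent.
    """
    policy = {}
    parts = dataset.split("/")
    for depth in range(1, len(parts) + 1):
        name = "/".join(parts[:depth])
        if parser.has_section(name):
            policy.update(parser[name])
    return policy


parser = configparser.ConfigParser()
parser.read_string(CONFIG)

# The child keeps its own `daily` and inherits the rest.
print(effective_policy(parser, "StorageZFS/Docker/data"))
# A dataset with no section of its own gets the parent's values unchanged.
print(effective_policy(parser, "StorageZFS/Docker/0007d487"))
```

Note that all values come back as strings, since that is what `configparser` stores.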

I use LXC (linux containers), and they are stored as zfs filesystems like

lxd/containers/...
lxd/images/...

The downloaded images are stored in lxd/images, and I don't want snapshots of those. So in my config I just tell it to take snapshots of lxd/containers:

[lxd/containers]
hourly = 24
daily = 7
weekly = 4
monthly = 6
yearly = 1
snap  = yes
clean = yes

A syntax like [StorageZFS/Docker/*] might be a good idea, but I have to check if it's easy to implement.
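For what it's worth, the matching part of such a wildcard syntax could be sketched with Python's standard `fnmatch` module. This is purely hypothetical and not something pyznap supports in this thread; `should_snap` is a made-up name, and the sketch only decides which section applies to a dataset. It does not address the fact that recursive snapshots include all children anyway.

```python
import configparser
from fnmatch import fnmatchcase

# The config proposed above, reduced to the `snap` option.
CONFIG = """
[StorageZFS/Docker]
snap = yes

[StorageZFS/Docker/*]
snap = no
"""


def should_snap(parser, dataset):
    """Decide whether `dataset` would be snapshotted (hypothetical rule).

    An exact section name wins; otherwise the first section whose name
    matches as a shell-style pattern is used. Unlisted datasets are skipped.
    """
    if parser.has_section(dataset):
        return parser.getboolean(dataset, "snap", fallback=False)
    for section in parser.sections():
        # Note: fnmatch's '*' also matches '/', so nested children match too.
        if fnmatchcase(dataset, section):
            return parser.getboolean(section, "snap", fallback=False)
    return False


parser = configparser.ConfigParser()
parser.read_string(CONFIG)

print(should_snap(parser, "StorageZFS/Docker"))           # exact section
print(should_snap(parser, "StorageZFS/Docker/0007d487"))  # matched by the wildcard
```

With this rule the parent filesystem is snapshotted while the per-image and per-container children are not.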

Skaronator commented 6 years ago

I don't wanna say something wrong but the volumes are saved on StorageZFS/Docker (/var/lib/docker) eg. /var/lib/docker/volumes/....

The images and containers are stored as separate ZFS filesystems, so StorageZFS/Docker/0007d487e46e6d79815c675cf3343116ec1778ff9d6506b9dbcc0c4e440fe1ab is mounted on /var/lib/docker/zfs/graph/0007d487e46e6d79815c675cf3343116ec1778ff9d6506b9dbcc0c4e440fe1ab.

So the "personal" data for each container is on StorageZFS/Docker, and the images and containers could simply be redownloaded, AFAIK? So there is no need to back those up. Also, the ID always changes when updating the container and pulling the latest image.

yboetz commented 6 years ago

I'm not really familiar with how Docker works, so I don't know how one should handle this, especially with the ID changing after every update. In the worst case you will have a lot of snapshots, but as pyznap automatically cleans them up this shouldn't be that big of a problem.

Skaronator commented 6 years ago

Alright, I asked on reddit about this and it's safe to say that Docker doesn't need backups of the containers and images. The important parts are stored on StorageZFS/Docker (/var/lib/docker/containers) and the volumes at /var/lib/docker/volumes. The images and containers can be recreated at any time, so there is no need to save snapshots of StorageZFS/Docker/<DockerId>.

I'm gonna close this, thanks for the quick responses! Hope to see [StorageZFS/Docker/*] syntax one day :P

ahjohannessen commented 6 years ago

@cythoning Your guidance above wrt. installing this thing would perhaps be a good idea to put in the readme :) It would also be a good thing if you could tag and do tar.gz releases.

Would you be interested in a PR that introduces a docker file and perhaps later some CI travis stuff that builds and publishes an image?

yboetz commented 6 years ago

I will update the readme later today if I have some time, but yes, that would be a good idea :). I haven't released any software before, so have no idea on how to go about this. I'll look into tagging, but I might want to work on it a bit more before I do an official release. As for now I think installing it via git clone is simple enough :).

I've also never used docker and travis CI, so I don't know about this. Would it be beneficial to run pyznap in a docker container? I would have thought that storage management is mostly done on the host, not in a container.

ahjohannessen commented 6 years ago

> I've also never used docker and travis CI, so I don't know about this. Would it be beneficial to run pyznap in a docker container? I would have thought that storage management is mostly done on the host, not in a container.

It would be controlled by the host via cron or similar, but it might be a little bit simpler for some of us to wrap the binaries in a container as well.

ahjohannessen commented 6 years ago

@cythoning I am currently setting up servers with zfs, so now I am evaluating backup / replication tools. I know sanoid is popular, but just the idea of perl makes me cringe a bit, not that anything is wrong with it. If I end up using your tool, you can expect PRs wrt. CI and docker.

How similar are pyznap and sanoid/syncoid? I suppose the latter is more feature rich?

yboetz commented 6 years ago

I wrote pyznap as a clone of sanoid, so they are pretty similar. I used sanoid until recently, and I wanted pyznap to be able to carry on cleaning up old sanoid snapshots. I even kept the naming convention for snapshots (except for @pyznap... instead of @autosnap...). Sanoid does have a few more features, as it has also been around longer. But I started working on pyznap with a few goals in mind, and thus pyznap does a few things differently than sanoid. A few advantages I can think of:

I've been using sanoid for almost a year until recently, so I can wholeheartedly recommend it. But it had a few minor things that bugged me, that's why I started pyznap. I've been running pyznap on my server for a month now and it works perfect for me. If you want to use pyznap I'd obviously be happy to help if you have any problem with setup, and I'll also gladly accept PRs.

I also plan to add some more features in the near future, like "frequent" snapshots (every 15min or so).

yboetz commented 6 years ago

> It would be controlled by the host via cron or similar, but it might be a little bit simpler for some of us to wrap the binaries in a container as well.

I wanted pyznap to be easy to set up and run, so it should be simple enough to run even without docker. All you have to do is git clone, install two packages, copy the config and set up a cronjob ;). But if it works better for you then go with docker.

ahjohannessen commented 6 years ago

I think I'll give your tool a shot. I was perhaps a bit too optimistic wrt. docker. I completely forgot that zfs userspace tools need to be available for the container - this makes versioning of those zfs tools a bit interesting (container vs. host with zfs versions etc.) - As we use Ubuntu Xenial for database servers I think I'll just use ansible to install pyznap for now, however tags and tar.gz releases would be a natural next step for pyznap if you wish uptake. That being said, I might take a shot at containerization if I am bored some weekend, but it would probably be too specific to my needs with regards to the container being based on Xenial and use same zfs binaries as the hosts and so on.

I program in Scala for work and Python seems a bit more to my taste than Perl if I need to figure out what is going on in pyznap or contribute some code. I like the fact that you have tests.

ahjohannessen commented 6 years ago

> I wanted pyznap to be easy to setup and run, so it should be simple enough to run even without docker. All you have to do is git clone, install two packages, copy the config and setup a cronjob ;). But if it works better for you then go with docker.

I am not sure how Python people handle this, but would it perhaps be an idea to have two requirements.txt files? One for development with test-related dependencies and one for production without those?

yboetz commented 6 years ago

Sounds great. Let me know if anything doesn't work or config setup is complicated. I also use Ubuntu Xenial for my server, and I installed it the way I described above. Works perfectly.

I'll also split the requirements file; only paramiko and configparser are needed for running pyznap.
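If you want to confirm that the runtime dependencies are present in the Python environment that cron will use, a small check like the following might help. It is just a sketch; `missing_modules` is a made-up helper, and the module names are the two mentioned above.

```python
from importlib.util import find_spec

# The two packages named above as the only runtime requirements.
RUNTIME_DEPS = ["paramiko", "configparser"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(RUNTIME_DEPS)
print("missing runtime dependencies:", ", ".join(missing) or "none")
```

Run it with the same interpreter that the crontab line points to (the /path/to/python above), since a virtualenv and the system Python can have different packages installed.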