presslabs / z3

Backup your ZFS snapshots to S3.
https://www.presslabs.com/code/z3/

Mounting ZFS snapshot from S3 #32

Open ad-m opened 5 years ago

ad-m commented 5 years ago

A ZFS snapshot, once created, is never modified again: it is a fixed set of blocks that can no longer change.

Object storage is optimized for data that is "write once, read many, delete eventually". Snapshots fully meet these assumptions.

In the case of Amazon S3, reads can reach speeds comparable to reading from disks, which are themselves often attached over the network anyway.

Have you considered implementing ZFS snapshot mounting directly from S3, to enable previewing backup data? If necessary, we could consider a local cache for some blocks. A similar approach is used by TrailDB ( http://tech.nextroll.com/blog/data/2016/11/29/traildb-mmap-s3.html ) and MezzFS ( https://medium.com/netflix-techblog/mezzfs-mounting-object-storage-in-netflixs-media-processing-platform-cda01c446ba ) with very interesting results.
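The "local cache for some blocks" idea could look something like the sketch below: a read-through LRU cache of fixed-size blocks in front of ranged reads. The `fetch_block` callable is a placeholder standing in for a real S3 client doing ranged GETs; block and cache sizes are illustrative, not anything z3 implements today.

```python
from collections import OrderedDict

class BlockCache:
    """Read-through LRU cache of fixed-size blocks fetched from a
    remote backend (e.g. S3 ranged GETs). `fetch_block(offset, length)`
    is an assumed callable standing in for a real S3 client."""

    def __init__(self, fetch_block, block_size=128 * 1024, max_blocks=256):
        self.fetch_block = fetch_block
        self.block_size = block_size
        self.max_blocks = max_blocks
        self._cache = OrderedDict()  # block index -> block bytes

    def _get_block(self, idx):
        if idx in self._cache:
            self._cache.move_to_end(idx)  # mark as most recently used
            return self._cache[idx]
        data = self.fetch_block(idx * self.block_size, self.block_size)
        self._cache[idx] = data
        if len(self._cache) > self.max_blocks:
            self._cache.popitem(last=False)  # evict least recently used
        return data

    def read(self, offset, length):
        """Read an arbitrary byte range, fetching only the blocks it covers."""
        out = bytearray()
        end = offset + length
        while offset < end:
            idx, within = divmod(offset, self.block_size)
            chunk = self._get_block(idx)[within:within + (end - offset)]
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

Repeated reads of hot metadata (the case a filesystem preview hits hardest) would then be served locally, and only cold blocks would incur an S3 round trip.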

It is worth noting that in AWS EBS, volumes restored from snapshots are attached online, and the necessary blocks are downloaded locally only at the time of access ( https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html ). Implementing the ability to read backups would open up similar possibilities.

I realize that individual blocks will not be readable when compression is applied to the backup stream. However, I think the potentially unlimited capacity of snapshots and direct access to them offset this limitation, especially since ZFS can provide compression and block-level encryption on its own.

ad-m commented 4 years ago

Interested in this issue, I experimented with creating a ZFS filesystem on top of a block device provided by s3backer. This lets me store snapshots on S3 while also consolidating them and deleting them in any order. s3backer removes blocks that have been TRIMmed to free up space in S3, and supports thin provisioning (you can create a 1 TB block device, and the space used from the S3 perspective is only the space actually used by ZFS).
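For reference, the setup I experimented with looks roughly like this; the bucket name, mount point, and sizes are illustrative, and the flags are s3backer's documented options, so double-check against your version:

```shell
# Expose the bucket as a single sparse 1 TiB file via FUSE.
# --listBlocks scans the bucket on startup so TRIM/zero detection works.
s3backer --blockSize=128k --size=1t --listBlocks \
    my-backup-bucket /mnt/s3backer

# Build a pool on that file. autotrim=on makes ZFS issue TRIMs so
# s3backer can delete freed blocks from S3; compression is left to ZFS.
zpool create -o autotrim=on -O compression=lz4 \
    tank /mnt/s3backer/file
```

With this layout, `zfs destroy` of any snapshot eventually translates into deleted S3 objects, which is what allows removing snapshots in any order.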

I see a risk to the consistency of the data written to the disk, which follows from the very nature of s3backer. I notice that there is a local cache to reduce read-after-write consistency problems, and that the backup system has clearly separated read and write sequences.

I can imagine introducing a limit on how often the different operations may be performed, to ensure data safety. In many cases the restore operation is latency-tolerant (it may start 5-10 minutes later), especially when that delay is weighed against its effectiveness.