
migratefs


migratefs is a filesystem overlay for transparent, distributed migration of active data across separate storage systems.

This project started as a fork of fuse-overlayfs, an implementation of overlay+shiftfs in FUSE for rootless containers, but it has since diverged significantly and now operates on very different premises.

About

migratefs is a FUSE-based filesystem overlay designed to seamlessly migrate data from one filesystem to another. It aggregates multiple separate filesystems or directories into a single stacked view of their contents, and migrates modified data from the lower layer to the upper layer.

Table of contents

- Description
- Rationale
- Use case
- Migration timeline
- Features
- Usage
- Limitations

Description

The purpose of migratefs is to provide a way to migrate active data across storage systems with minimal user impact. As far as users or applications are concerned, files stay in the same place and can be accessed through the same paths during the migration, while their contents are transparently moved from one storage system to another.

It has been designed to be completely transparent to users and applications, to distribute the migration work across all clients, and to avoid any extended downtime.

It is particularly useful for migrating data between network filesystems that are mounted on the same set of clients, but it can also be used with local filesystems.
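For instance, here is a minimal local sketch (all paths are made up for the example): two directories are stacked into a single merged view that exposes the contents of both layers.

$ mkdir -p /tmp/lower /tmp/upper /tmp/merged
$ touch /tmp/lower/old_results.dat /tmp/upper/new_results.dat
$ migratefs -o lowerdir=/tmp/lower,upperdir=/tmp/upper /tmp/merged
$ ls /tmp/merged
new_results.dat  old_results.dat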

Rationale

Data storage life cycle

In scientific computing environments, a substantial number of users typically work on large-scale HPC clusters and store their data on parallel, distributed filesystems. Those filesystems have a lifetime of several years, but aging hardware eventually needs to be replaced to accommodate increases in storage density and I/O performance, as well as evolving user needs.

And when storage systems are replaced, data needs to be migrated.

Traditional data migration methods

A few typical approaches are generally adopted when the time comes to retire an older storage system and replace it with a new one: either administrators copy all of the data over to the new system, or users are asked to move their own files themselves.

migratefs helps solve the filesystem data migration problem by letting storage administrators enable a transparent overlay on top of their existing filesystems. The overlay bridges the old and the new storage systems, and makes every single write() operation on existing files participate in the migration of the active dataset, completely transparently for users.
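For example (a sketch; the file name is hypothetical, and the layers are assumed to be mounted as in the Execution section below, with /oldscratch as the lower layer and /newscratch as the upper layer), modifying a file that currently lives only on the old filesystem moves it to the new one:

$ ls /newscratch/results.csv
ls: cannot access '/newscratch/results.csv': No such file or directory
$ echo "run 42,ok" >> /scratch/results.csv    # any write() through the overlay...
$ ls /newscratch/results.csv                  # ...copies the file up to the new filesystem
/newscratch/results.csv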

migratefs only migrates actively used data, is completely transparent to users and applications, and doesn't require any extended downtime.

| Method | copy all the data | users move their files | migratefs |
| --- | --- | --- | --- |
| ignore inactive data | :no_entry: | :heavy_check_mark: | :heavy_check_mark: |
| transparent for users | :no_entry: | :no_entry: | :heavy_check_mark: |
| can be done online | :no_entry: | :heavy_check_mark: | :heavy_check_mark: |
| distributed data transfers | possible | possible | :heavy_check_mark: |

Use case

migratefs has been developed to solve the typical case of an HPC center needing to retire a shared, automatically purged /scratch filesystem, and to move all of its actively-used data to a new storage system.

Many computing centers define purge policies on their large filesystems, which automatically delete files based on their age, access patterns, etc. Enabling migratefs over purged filesystems makes it easier to bound the migration period: files that are actively used will be transferred over to the new filesystem, while files that sit idle will progressively be removed by the existing purge policies. In the end, all the active data will have been moved over to the new filesystem and the old filesystem will be empty, so it can be retired and decommissioned.

migratefs is designed to be used temporarily, over a period of time during which data will be migrated between filesystems. When the migration is done, the migratefs layer can be removed and normal filesystem operations can be resumed.

Direct access to the underlying filesystem layers is always possible, although during a data migration it is better to leave the lower layers unmodified. New files can be written to and read from the upper layer directly, without any impact on migratefs operation.
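As a small sketch (hypothetical file name, using the /scratch_new upper layer from the use case above), a file created through the overlay lands on the upper layer and can be read back from either path:

$ echo "hello" > /scratch/notes.txt     # written through the migratefs overlay
$ cat /scratch_new/notes.txt            # the same file, read directly from the upper layer
hello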

Migration timeline

Let's say you have a /scratch filesystem that needs to be retired, and you already have a new filesystem ready to replace it. The typical timeline for a data migration with migratefs would look like this:

Once the migration is over, users continue to use /scratch as before, except now, all their files are on the new filesystem and the old one has been retired.

<img align="center" src="https://docs.google.com/drawings/d/e/2PACX-1vT0i3mCSl-22U8e-hu3uNH81AN2vH-jgwUnsgBUU1Wc41Quv8x-00DH52zyA6j4D8o1TGVibdEwwjuF/pub?w=981&h=561"/>
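On each client, the overlay setup for the migration period could look like the following sketch (assuming the old and new filesystems are already mounted at /scratch_old and /scratch_new):

$ migratefs -o lowerdir=/scratch_old,upperdir=/scratch_new /scratch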

During the migration period:

- users keep accessing their files through the same /scratch paths,
- files that are modified are transparently copied up from /scratch_old to /scratch_new,
- files that sit idle on /scratch_old are progressively removed by the existing purge policies.

In the end, all the active data will be on /scratch_new and /scratch_old will be empty. The migration will have happened in a completely distributed way, since every client will have participated in it; the old system can then be retired, and the new system is ready to use natively, without any old data lingering around.

Features

High-level

In practice

Usage

Requirements

migratefs requires libfuse 3.x.
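A quick way to check which libfuse version is available (assuming the distribution provides a pkg-config file for it; the CentOS 7 package names below come from EPEL and are given only as an example):

$ pkg-config --modversion fuse3          # should report a 3.x version
$ sudo yum install fuse3 fuse3-devel     # CentOS 7 / EPEL example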

A specfile for CentOS 7 can be found here.

Installation

$ ./autogen.sh
$ ./configure
$ make
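To install the resulting binary, the standard autotools install target should work (a sketch; the install step is assumed, not documented by the project):

$ sudo make install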

To build an RPM:

$ make rpm

Execution

$ migratefs -o lowerdir=/oldscratch,upperdir=/newscratch /scratch
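When the migration is complete, the overlay can be unmounted with the standard libfuse 3 tooling (or a plain umount as root):

$ fusermount3 -u /scratch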

Options

TBW

Limitations

TBW