Adding/removing backends from an existing mountpoint

johnbent commented 11 years ago

One nice feature of PLFS is how it aggregates multiple backends (which might be multiple storage volumes) into a single namespace. Users who want this feature will often want to grow their capacity by adding new volumes to an existing mountpoint. I suspect this is more important to users using PLFS in flat-file mode than in shared file mode since shared file mode environments are more likely to be checkpoint environments where PLFS is being used for scratch. If users want to grow PLFS in a scratch environment, I'd suggest just making a new mount point.

But users who use PLFS in flat file mode may be interested in maintaining a single namespace in perpetuity and will want to be able to add new volumes to that namespace. Here are some thoughts about how we might one day enable this functionality:

Adding or removing volumes from an existing PLFS mount breaks down into a bunch of sub-conversations. We need to discuss how adding and removing volumes each affect existing files and existing directories in each of the three PLFS mount types: shared file, small file, and flat file. None of this is possible today, and any of it will require engineering effort.

So we have to discuss:

1. remove volume effect on directories in flat file
2. remove volume effect on directories in small file
3. remove volume effect on directories in shared file
4. add volume effect on directories in flat file
5. add volume effect on directories in small file
6. add volume effect on directories in shared file
7. remove volume effect on files in flat file
8. remove volume effect on files in small file
9. remove volume effect on files in shared file
10. add volume effect on files in flat file
11. add volume effect on files in small file
12. add volume effect on files in shared file

Let's also review when we would want to use each mode. All modes work great for aggregating multiple volumes into a single namespace. Shared file mode is for users who also care about multi-writer performance. Small file mode is for users with an extreme workload in which individual processes each create a very large number of files. Flat file mode is for users who only care about aggregating multiple volumes into a single namespace. Small file mode is not well tested and is expected to perform poorly for reads, so it should probably be avoided until we can test it more.

So, to address the 12 sub-conversations:

Numbers 1-3 are easy. Removing volumes has no effect on directories.

Numbers 4-6 (adding volumes effect on directories) are slightly more difficult. The way that PLFS aggregates multiple volumes into a single namespace is by replicating the directory namespace across every volume. This is the same for all three PLFS modes. Therefore when a new volume is added, we will have to traverse the complete namespace and must copy the directory namespace from an existing volume to the new volume. We could potentially do this on-line and we could potentially mask this from the user. I'll discuss this in more depth below.
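To make the directory replication step concrete, here is a minimal Python sketch. It is not PLFS code: the function name `replicate_directories` and the idea of treating each volume as a plain directory tree are assumptions for illustration only.

```python
import os

def replicate_directories(existing_volume: str, new_volume: str) -> int:
    """Recreate every directory under existing_volume beneath new_volume.

    Hypothetical helper, not part of PLFS. Only directories are copied;
    files stay where they are. Returns the number of directories created.
    """
    created = 0
    # os.walk is top-down by default, so parents are seen before children.
    for dirpath, _dirnames, _filenames in os.walk(existing_volume):
        rel = os.path.relpath(dirpath, existing_volume)
        target = new_volume if rel == "." else os.path.join(new_volume, rel)
        if not os.path.isdir(target):
            os.makedirs(target)
            created += 1
    return created
```

Running it a second time creates nothing, which is the property an on-line or lazy repair would want. The sketch ignores ownership and permissions, which a real implementation would have to preserve.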

Numbers 7-9. Removing a volume's effect on files. We will need to be told ahead of time that a volume will be removed. Then we will have to traverse the complete namespace on the volume to be removed and copy every byte there to one of the volumes that won't be removed.
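A rough Python sketch of that drain step, under some loud assumptions: volumes are plain directory trees, files are placed by an MD5 hash of their relative path (PLFS's real hash function is not described in this thread), and `pick_volume` and `drain_volume` are made-up names.

```python
import hashlib
import os
import shutil

def pick_volume(relative_path, volumes):
    """Hypothetical placement rule: hash the path, take it modulo the volume count."""
    digest = hashlib.md5(relative_path.encode("utf-8")).digest()
    return volumes[int.from_bytes(digest[:8], "big") % len(volumes)]

def drain_volume(leaving, remaining):
    """Copy every file on `leaving` to one of the `remaining` volumes.

    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(leaving):
        rel_dir = os.path.relpath(dirpath, leaving)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            dest = os.path.join(pick_volume(rel, remaining), rel)
            # The directory tree is replicated on every volume, but be defensive.
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(os.path.join(dirpath, name), dest)
            copied.append(rel)
    return sorted(copied)
```

Note that this only handles the files on the departing volume. With hash-based placement, shrinking the volume set can also change where files on the surviving volumes are expected to live, so the same relocation concerns as in the add-volume cases would apply to them.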

Number 10. Add volume effect on files in flat file mode. We will need to traverse the entire namespace and every single file will need to be moved from its current volume to a new volume. We might be able to do this online and mask this from the user. Potentially we could actually avoid doing any moves either by remembering the old set of volumes or by using symlinks. I'll discuss more at the end.

Number 11. Add volume effect on files in small file mode. Interestingly, this is the easiest of all 12. There is nothing to do. This is because a file in small file mode can reside anywhere whereas the other modes expect files (or containers) to be at a specific volume depending on the hash of the filename.
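The hash-based placement in the other two modes is why adding a volume displaces existing entries. The following Python sketch illustrates the effect; the MD5-modulo rule and the names `volume_index` and `displaced` are assumptions for illustration, not the hash PLFS actually uses.

```python
import hashlib

def volume_index(name, n_volumes):
    """Hypothetical placement: hash of the filename modulo the number of volumes."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_volumes

def displaced(names, old_count, new_count):
    """Names whose expected volume changes when the volume count changes."""
    return [n for n in names
            if volume_index(n, old_count) != volume_index(n, new_count)]

names = [f"file{i}.dat" for i in range(1000)]
moved = displaced(names, 3, 4)
print(f"{len(moved)} of {len(names)} names map to a different volume")
```

Under a plain modulo rule, a name keeps its volume when going from 3 to 4 volumes only if its hash modulo 12 is 0, 1, or 2, so roughly three quarters of the names would be expected to move. A different placement function would give a different fraction.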

Number 12. Add volume effect on files in shared file mode. This one is similar to #10 but easier since it definitely requires no actual data movement. For this one, we will need to traverse the entire namespace and manipulate only the plfs metadata (i.e. container structure) for every file whereas in flat file we might have to move the actual data.

For all of the ones where we add volumes, we might actually be able to avoid doing any work either synchronously or even altogether. The problem is that after a volume change we will look for entries in the wrong location. What we could do in our configuration file (i.e. plfsrc) would be to remember ALL previous sets of volumes. Then whenever we fail to find a file, we just keep looking for it in every possible location. If we find it, we just use it wherever it is. We could repair it at this point and move it to its more proper location or we could leave it where it is. Whenever we fail to find a directory, we just check whether that directory is supposed to exist by looking at the directory hierarchy in one of the original volumes and if it is supposed to exist, then we will repair it at that time. Now this does make operations slower since we can't synchronously fail operations that are supposed to fail; for example, when a user looks up a file or directory that actually exists nowhere, we will take a long time to figure this out since we will look for that file in multiple locations.
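A minimal sketch of that fallback lookup in Python. The `volume_history` list stands in for whatever plfsrc would record (no such plfsrc syntax exists today), and the MD5-modulo placement and the function names are assumptions for illustration.

```python
import hashlib
import os

def volume_for(name, volumes):
    """Hypothetical placement rule: hash of the name modulo the volume count."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return volumes[int.from_bytes(digest[:8], "big") % len(volumes)]

def find_file(name, volume_history):
    """Look for `name` under the current volume set, then under each older set.

    volume_history is a list of volume lists, newest first.
    Returns the path where the file was found, or None.
    """
    tried = set()
    for volumes in volume_history:
        candidate = os.path.join(volume_for(name, volumes), name)
        if candidate in tried:
            continue  # same location as an earlier generation, skip the stat
        tried.add(candidate)
        if os.path.exists(candidate):
            return candidate
    return None
```

A lookup for a name that exists nowhere has to probe every distinct candidate before it can return `None`, which is the slowdown described above. The `name` argument is assumed to be a path relative to the mount point.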

hailbird commented 11 years ago

When adding a volume, can we leverage a metalink to redirect the file to its original location? This would avoid moving the file and rescanning the possible locations.

johnbent commented 11 years ago

For containers in shared-file mode, absolutely. In fact, plfs_recover already does exactly this. If you somehow have a PLFS container not in its canonical location (which would happen when a volume is added), then plfs_recover merely makes an empty canonical container at the correct location, and uses a metalink to point at the existing container.

But the flat-file code doesn't know about metalinks. Teaching it about them would be a nice way to enable adding volumes to a flat-file mount without requiring data movement.
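As a rough illustration of what that could look like for flat files, here is a Python sketch in which an ordinary symlink stands in for a PLFS metalink. That substitution, and the name `link_to_existing`, are assumptions for illustration; this is not how plfs_recover is implemented.

```python
import os

def link_to_existing(canonical_path, existing_path):
    """Leave the data where it is and place a link at the canonical location.

    A plain symlink is used here as a stand-in for a PLFS metalink.
    Returns True if a link was created, False if something is already there.
    """
    if os.path.lexists(canonical_path):
        return False
    parent = os.path.dirname(canonical_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    os.symlink(os.path.abspath(existing_path), canonical_path)
    return True
```

After the link exists, a lookup at the canonical location succeeds on the first probe and no data has moved.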

johnbent commented 11 years ago

I was wondering who hailbird was because I saw and replied to this directly through my phone. I've come to github now to identify the mysterious Mr. Hailbird. Hello Haiyun!

plfs / plfs-core

Adding/removing backends from an existing mountpoint #321