If we rename an open PLFS file, we need to update our in-memory data structures to reflect the new name, or we'll be left with an open PLFS file that points to a backing store directory that is no longer there. (I'm mainly looking at ContainerFS here.)
There are many cases where we do not attempt to track renames. For example, two independent processes using PLFS have no easy way of knowing when one process renames the other's file.
However, there is a hook that lets the PLFS FUSE daemon catch this case: when FUSE renames a file, it checks to see if the file is open. If so, it calls Plfs_fd::renamefd(struct plfs_physpathinfo *ppip_to). That call should update all in-memory data structures to reflect the new filename. However, examining the code, it seems that not all filenames are updated.
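To make the hook concrete, here is a minimal sketch of the "is it open? then fix up the handle" step. Everything in it is hypothetical: `OpenHandle`, `OpenTable`, and `on_rename` are stand-ins I made up for illustration, not the real Plfs_fd or the FUSE daemon's open-file bookkeeping, and the logical and backing paths are collapsed into one string for brevity.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an open PLFS file handle (not the real Plfs_fd).
struct OpenHandle {
    std::string backing_path;  // where the handle believes its container lives

    // Stand-in for renamefd(): point the handle at the new location.
    void renamefd(const std::string &to) { backing_path = to; }
};

// Hypothetical table of currently open files, keyed by path.
using OpenTable = std::map<std::string, OpenHandle>;

// Meant to run after the on-disk rename has succeeded. If the source path
// is open, re-key the table entry and let the handle update its own state.
// Returns true when an open handle was found and updated.
bool on_rename(OpenTable &open_files,
               const std::string &from, const std::string &to) {
    auto it = open_files.find(from);
    if (it == open_files.end()) {
        return false;  // not open in this process, nothing to fix up
    }
    OpenHandle handle = it->second;
    open_files.erase(it);
    handle.renamefd(to);
    open_files[to] = handle;
    return true;
}
```

The point of the sketch is only that the hook is as good as renamefd() itself: if renamefd() leaves any stored path untouched, the handle is stale even though the hook fired.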
Start with the Container_OpenFile (COF) class and recursively look at its data members. The COF itself has a "string path" and a canonical backend that renamefd() does update, but the COF also has WriteFile and Index structures that need to be updated.
The call to WriteFile setPhysPath only updates WriteFile's bnode, container_path, and canback. It does not update WriteFile's subdirpath, subdirback, or the "paths" C++ map of filenames. WriteFile also points to its own Index structure for write indexing (if that has any filenames, they need to be updated too).
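As a rough sketch of what a more complete setPhysPath could look like, the code below rewrites every stored path that lives under the old container path. It is not the real WriteFile: `WriteFileSketch`, `rebase_path`, the field layout, and the example file names are all assumptions for illustration. It also assumes every derived path sits under the container path on a single backend, so it says nothing about how subdirback should be handled.

```cpp
#include <map>
#include <string>

// Replace old_prefix at the start of path with new_prefix. The prefix must
// end on a path component boundary; other paths are returned unchanged.
std::string rebase_path(const std::string &path,
                        const std::string &old_prefix,
                        const std::string &new_prefix) {
    const std::size_t n = old_prefix.size();
    const bool under_prefix =
        path.size() >= n &&
        path.compare(0, n, old_prefix) == 0 &&
        (path.size() == n || path[n] == '/');
    if (!under_prefix) {
        return path;
    }
    return new_prefix + path.substr(n);
}

// Hypothetical, simplified model of the path-bearing WriteFile members.
struct WriteFileSketch {
    std::string container_path;
    std::string subdir_path;           // assumed to live under container_path
    std::map<int, std::string> paths;  // writer id -> dropping filename

    // Update the container path and every path derived from it.
    void setPhysPath(const std::string &new_container) {
        const std::string old_container = container_path;
        subdir_path = rebase_path(subdir_path, old_container, new_container);
        for (auto &entry : paths) {
            entry.second =
                rebase_path(entry.second, old_container, new_container);
        }
        container_path = new_container;
    }
};
```

Rewriting by prefix keeps the update in one place, but it only works if the stored strings were built from the container path to begin with; anything computed from a different backend would need its own handling.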
There isn't a call under renamefd() to the Index to update its path info. I'm also wondering about updating the C++ map called "chunk_map", which has dropping filenames in it.
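For the Index side, something along these lines might be enough. This is a sketch under assumptions: `ChunkEntry`, `ChunkMap`, and `rebase_chunk_paths` are hypothetical names, and the real chunk_map may be laid out differently. Entries that already hold an open descriptor keep working after a rename on POSIX, so the stale path mostly matters for droppings that get opened later, but rewriting every entry keeps the map consistent.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical chunk entry: a dropping filename plus a lazily opened fd.
struct ChunkEntry {
    std::string path;
    int fd;  // negative means the dropping has not been opened yet
};

// Hypothetical stand-in for the Index chunk_map, keyed by chunk id.
using ChunkMap = std::map<int, ChunkEntry>;

// Rewrite every chunk path under old_dir so that it lives under new_dir.
// Open descriptors are left alone. Returns the number of rewritten entries.
std::size_t rebase_chunk_paths(ChunkMap &chunks,
                               const std::string &old_dir,
                               const std::string &new_dir) {
    const std::string old_slash = old_dir + "/";
    std::size_t rewritten = 0;
    for (auto &item : chunks) {
        std::string &path = item.second.path;
        // Only touch paths strictly inside the old container directory.
        if (path.compare(0, old_slash.size(), old_slash) == 0) {
            path = new_dir + "/" + path.substr(old_slash.size());
            ++rewritten;
        }
    }
    return rewritten;
}
```

A renamefd() that called a helper like this on both the COF's Index and the WriteFile's Index would cover the filenames that currently appear to be skipped.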
It seems like renamefd() needs to update more than it currently does in order to keep the Plfs_fd up to date.