cxl-micron-reskit / famfs

This is the user space repo for famfs, the fabric-attached memory file system
Apache License 2.0

In the famfs file system, after deleting a file using the rm or unlink command or interface, the space occupied by that file cannot be reclaimed and reused for creating new files. #61

Closed lordiscat closed 3 months ago

lordiscat commented 3 months ago

In the famfs file system, after deleting a file using the rm or unlink command or interface, the space occupied by that file cannot be reclaimed and reused for creating new files. When the famfs file system and the CXL memory are full of files, how can I reuse the CXL memory other than by powering off?

jagalactic commented 3 months ago

This is working as currently designed, although the documentation could use improvement.

Using the normal rm command on a famfs file is invalid, but famfs does not (yet) have a way to prevent it. So removing a famfs file leaves the file system in an invalid state. You can see this with the famfs check command (which will notice files that are in the metadata log but don't exist due to rm), and you can restore the removed files via either famfs logplay or unmounting and remounting - in either case the removed file will come back.
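To illustrate the kind of mismatch described above (files recorded in the metadata log but missing from the mounted namespace), here is a minimal Python sketch. It is not how famfs check is implemented and it does not read the famfs log: `missing_files` and the `logged_paths` list are hypothetical, and a temporary directory stands in for the mount point.

```python
import os
import tempfile


def missing_files(mount_point, logged_paths):
    """Return the logged relative paths that no longer exist under mount_point.

    logged_paths is a hypothetical stand-in for the file names recorded in
    the metadata log; this sketch does not parse a real famfs log.
    """
    return [
        p for p in logged_paths
        if not os.path.exists(os.path.join(mount_point, p))
    ]


# A temporary directory stands in for a mounted file system.
with tempfile.TemporaryDirectory() as mnt:
    logged = ["a.dat", "b.dat"]
    for name in logged:
        open(os.path.join(mnt, name), "w").close()

    # Simulate an invalid "rm" of one logged file.
    os.remove(os.path.join(mnt, "b.dat"))

    result = missing_files(mnt, logged)
    print(result)
```

The log still lists the removed file, which is why replaying the log brings it back.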

The architectural problem with rm is that a core design point of famfs is toleration of client nodes with a stale view of metadata (i.e. clients that are behind on logplay). The current solution is not to remove files and not to re-use memory -- because that avoids the risk that one client sees a file that it doesn't know has been deleted, and another node sees a new file that re-uses the same memory.
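The hazard can be sketched with a toy model in Python. Everything here is an assumption made for illustration: `SharedLog`, `Client`, the `"delete"` log entry (famfs has no such entry today) and the `(offset, length)` extent tuples are hypothetical and do not reflect the real famfs log format.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLog:
    """Append-only metadata log shared by all clients (toy model)."""
    entries: list = field(default_factory=list)

    def append(self, op, name, extent=None):
        self.entries.append((op, name, extent))


@dataclass
class Client:
    """A node that replays the shared log into its own local view."""
    name: str
    replayed: int = 0                          # log entries applied so far
    files: dict = field(default_factory=dict)  # file name -> memory extent

    def logplay(self, log):
        for op, fname, extent in log.entries[self.replayed:]:
            if op == "create":
                self.files[fname] = extent
            elif op == "delete":  # hypothetical entry type
                self.files.pop(fname, None)
        self.replayed = len(log.entries)


def extent_conflicts(a, b):
    """List (file_on_a, file_on_b, extent) where one extent backs two names."""
    return [
        (fa, fb, ea)
        for fa, ea in a.files.items()
        for fb, eb in b.files.items()
        if ea == eb and fa != fb
    ]


log = SharedLog()
log.append("create", "old.dat", extent=(0, 4096))

stale, fresh = Client("stale"), Client("fresh")
stale.logplay(log)
fresh.logplay(log)

# Hypothetical delete followed by re-use of the same memory.
log.append("delete", "old.dat")
log.append("create", "new.dat", extent=(0, 4096))
fresh.logplay(log)  # only one client catches up

print(extent_conflicts(stale, fresh))
```

Until the stale client replays the log, it still maps `old.dat` onto memory that the other client now treats as `new.dat`. Never re-using memory removes that window entirely.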

Supporting delete isn't a hard problem from a log-structured-file-system standpoint, but managing the metadata consistency across a scale-out group of nodes with a shared log in a generalized way is not simple -- and is not in the near-term plan.

We are discussing the possibility of adding famfs rm in a way that particular use cases can use validly by applying some constraints. But that will likely require that use cases follow some (TBD) rules. The bottom line is that we probably need to discuss your use case in order to assess how it impacts potential plans to support famfs rm.

In the meantime, most of our users are unmounting and re-creating famfs file systems in order to clear them and start over.

Famfs file systems are largely used as a shared memory repository for data sets that need to be shared as memory mappable files (quite often read-only once they have been published). When the data is no longer needed, the famfs file system is destroyed and the memory is re-used.
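As a rough sketch of the consumer side of that pattern, the following Python maps a published file read-only using only the standard `mmap` module. The helper `map_read_only` and the file name `dataset.bin` are hypothetical, and a temporary directory stands in for a famfs mount; nothing here is famfs-specific.

```python
import mmap
import os
import tempfile


def map_read_only(path):
    """Memory-map an existing file read-only and return the mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping outlives the file object.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


# A temporary directory stands in for a famfs mount point.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "dataset.bin")
    with open(path, "wb") as f:
        f.write(b"published data")  # the "publish" step

    view = map_read_only(path)
    first = bytes(view[:9])

    # Writes through a read-only mapping are rejected.
    try:
        view[0:1] = b"X"
        writable = True
    except TypeError:
        writable = False

    view.close()

print(first, writable)
```

Mapping with `ACCESS_READ` matches the "read-only once published" usage: consumers can read the data as memory but cannot modify it through the mapping.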

This pattern will become a lot cleaner when CXL 3.1 dynamic capacity devices (DCDs) are available. Memory can be allocated as shareable "tagged capacity" which will appear as a dax device identified by a uuid. When you're finished with a famfs file system, you would deallocate the tagged capacity by uuid, and it would disappear from all the hosts that can see it - and the capacity will be usable for future allocations...