Does Ralloc provide fail-safe atomicity and consistency

ZhangJiaQiao commented 3 years ago

During operations such as Alloc, Ralloc needs to modify several pieces of persistent metadata. How does it guarantee the atomicity and data consistency of these operations? Intuitively, they need to be transactional, as in PMDK, but neither the source code nor the paper mentions this issue. I may have missed some details.

qtcwt commented 2 years ago

I just saw the question today; sorry. Maybe you already found the answer, but I'll explain just in case.

The bottom line is that a persistent allocator doesn't need failure atomicity for the metadata updates across an entire malloc/free operation; this is also the key observation behind Ralloc. Instead, all we need to know to recover the allocator to a correct state is whether a memory block is still in use and, if so, what its size is (Section 3 in the paper). In Ralloc, the former is determined by a traversal from the persistent roots during recovery, and a block's size is the size recorded for its superblock, which is persisted (i.e., flushed and fenced) before any block in that superblock is returned by malloc.
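To make the recovery idea concrete, here is a minimal sketch in Python. It is a toy model, not Ralloc's code: `Superblock`, `Heap`, `recover`, and `block_size` are hypothetical names, blocks are identified by `(superblock index, block index)` pairs, and the `pointers` map stands in for the references stored inside persistent blocks. The free set is treated as transient and is rebuilt purely from what is reachable.

```python
from dataclasses import dataclass, field

@dataclass
class Superblock:
    block_size: int   # assumed persisted before any block is handed out
    num_blocks: int

@dataclass
class Heap:
    superblocks: list                              # persistent
    pointers: dict = field(default_factory=dict)   # persistent: block -> blocks it references
    roots: list = field(default_factory=list)      # persistent roots
    free: set = field(default_factory=set)         # transient: lost on a crash

def recover(heap):
    """Rebuild the transient free set by tracing from the persistent roots."""
    in_use, stack = set(), list(heap.roots)
    while stack:
        block = stack.pop()
        if block in in_use:
            continue
        in_use.add(block)
        stack.extend(heap.pointers.get(block, ()))
    all_blocks = {(s, b)
                  for s, sb in enumerate(heap.superblocks)
                  for b in range(sb.num_blocks)}
    heap.free = all_blocks - in_use   # unreachable blocks are simply reclaimed
    return in_use

def block_size(heap, block):
    """A block's size is read from its superblock, never from per-block state."""
    return heap.superblocks[block[0]].block_size
```

In this model, a block that was handed out by malloc but never attached to anything reachable from a root before the crash just ends up back in `heap.free` after `recover`, so no log or transaction is needed to undo the half-finished allocation.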

More interestingly, a (transient) nonblocking object doesn't really need failure atomicity to keep its data consistent across power failures; that would be overkill. As long as cache lines are written back correctly, the data will eventually become consistent again after recovery. Think of it this way: an operation left half done by a power failure is just like a thread that has been stuck for a long time, and being nonblocking means other threads may help finish or abort the stuck operation.
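The "stuck thread" analogy can be illustrated with the helping pattern of a Michael-Scott style queue. This is only a sketch, not Ralloc's code: the `cas` function is a single-threaded stand-in for an atomic compare-and-swap, `crash_before_swing` is a hypothetical flag that simulates a stall or power failure between the two steps of an enqueue, and every write is assumed to have reached persistent memory.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def cas(obj, attr, expected, new):
    """Stand-in for an atomic compare-and-swap (single-threaded simulation only)."""
    if getattr(obj, attr) is expected:
        setattr(obj, attr, new)
        return True
    return False

class Queue:
    def __init__(self):
        dummy = Node(None)
        self.head = dummy
        self.tail = dummy

    def help_tail(self):
        """Finish another thread's half-done enqueue by swinging a lagging tail."""
        tail = self.tail
        nxt = tail.next
        if nxt is not None:
            cas(self, "tail", tail, nxt)

    def enqueue(self, value, crash_before_swing=False):
        node = Node(value)
        while True:
            tail = self.tail
            if cas(tail, "next", None, node):   # step 1: link the new node
                break
            self.help_tail()                    # tail is lagging: help, then retry
        if crash_before_swing:
            return                              # simulated stall or power failure
        cas(self, "tail", tail, node)           # step 2: swing the tail

    def values(self):
        out, node = [], self.head.next
        while node is not None:
            out.append(node.value)
            node = node.next
        return out
```

If one enqueue stops after step 1, the next enqueue notices the lagging tail, completes step 2 on its behalf, and then proceeds. The same helping code path works whether the first thread was merely slow or was cut off by a crash, which is why no separate failure-atomic transaction is required in this sketch.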

You may find more details in our full paper, "Understanding and Optimizing Persistent Memory Allocation", at ISMM'20.

Hope this helps.

ZhangJiaQiao commented 2 years ago

Got it.

urcs-sync / ralloc
