zfsuser opened this issue 2 months ago
First of all, this has nothing to do with scrub. Read-back would have to happen similarly to the writing done by l2arc_feed_thread, matching its speed. But since the logic of L2ARC is to write blocks that are soon to be evicted from ARC, attempting to re-read the old blocks would require even more ARC evictions, which would require even more L2ARC writes; that is a dead end.

The critical point we would need to decide is which blocks we consider important enough to read back and rewrite even though they have not been accessed for a while at this point; otherwise they could be rewritten into L2ARC the normal way some time later. That information is not stored in the persistent L2ARC metadata, so it will be lost on reboot, but in ARC we do have a counter of how many times each block in L2ARC was read since it was written there. But how would we compare stats for blocks that have already been in L2ARC for a while with blocks that are only about to be written there for the first time? We still want L2ARC to store the most frequently used blocks even if the workload is changing. For example, users may mostly access blocks stored over the last week and almost never the older ones. If we just kept the old blocks, we would penalize the new ones. Effectively we would need the MRU/MFU logic used for ARC, except that at L2ARC capacities we cannot afford the memory required for ghost states, since that would double the ARC memory used by L2ARC headers, which is already a problem.
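(For reference, a simplified sketch of the two structures being contrasted above; the field names and layout are paraphrased, not copied verbatim from the OpenZFS headers. The in-memory L2ARC buffer header carries a per-block hit counter, while the persistent log entry written to the cache device does not, which is why that usage history cannot survive a reboot.)

```c
/*
 * Simplified sketch, paraphrasing OpenZFS structures (not verbatim):
 * the hit counter lives only in RAM, so it is lost when the persistent
 * L2ARC is rebuilt from the on-disk log blocks after a reboot.
 */

/* In-memory L2ARC header, hung off the ARC buffer header. */
typedef struct l2arc_buf_hdr_sketch {
	void		*b_dev;		/* owning L2ARC device */
	uint64_t	b_daddr;	/* offset of the block on the cache device */
	uint32_t	b_hits;		/* reads served from L2ARC since the block was written there */
} l2arc_buf_hdr_sketch_t;

/* Persistent L2ARC log entry, as written to the cache device. */
typedef struct l2arc_log_ent_sketch {
	uint64_t	le_dva[2];	/* block pointer DVA */
	uint64_t	le_birth;	/* birth TXG */
	uint64_t	le_prop;	/* sizes, compression, type, ... */
	uint64_t	le_daddr;	/* offset on the cache device */
	/* note: no hit counter here, so usage history is lost on reboot */
} l2arc_log_ent_sketch_t;
```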
Well, we do have the SSD sitting there... is there any reason we couldn't keep most-frequently-hit/most-recently-hit state in a tiny section at the end, for data? (Metadata we probably just want to cache, period, if the drive is large enough...)
We cannot keep all the old metadata, if for no other reason than that the current persistent L2ARC implementation, during replay, often resurrects blocks that were deleted earlier, until the L2ARC has completely rotated; that would mean the L2ARC gradually fills with obsolete metadata blocks.
I mean, sure, we can't keep all of it; I just meant that, philosophically, we don't necessarily want to track MRU/MFU for metadata eviction in the same way, because metadata doesn't have the same "if we didn't use it recently/often, we don't care" properties.
The original feature description was missing context and was therefore misleading. I have added additional information to (hopefully) clarify the intention of this feature.
Describe the feature you would like to see added to OpenZFS
Option to allow zpool scrub to "refresh" L2ARC stored pool metadata.
How will this feature improve OpenZFS?
Metadata (and data) stored in the L2ARC (but not in the ARC) is typically lost from the L2ARC over time by being overwritten, depending on the relative L2ARC size and churn rate. This is a problem when trying to keep the complete pool metadata in the L2ARC, especially if (a limited amount of) MFU data shall also be stored there (e.g. using #16343).
If a zpool scrub forwarded the pool metadata it reads to the ARC, the l2arc_feed_thread could then store most of the missing pool metadata in the L2ARC. This would also make it possible to trigger caching of the (complete) pool metadata in the L2ARC.
It would be a workaround, but a cyclic scrub should be able to keep most of the pool metadata in the L2ARC.
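To make the intended mechanism concrete, here is a hedged sketch. It is not actual OpenZFS code: the tunable zfs_scrub_l2arc_refresh and the helper scan_read_metadata_sketch() are illustrative names. The idea is that, when the tunable is enabled, metadata read on behalf of a scrub goes through the ARC and is flagged as L2ARC-eligible, so the unmodified l2arc_feed_thread can write it to the cache device before the buffer is evicted.

```c
/*
 * Hypothetical sketch only -- zfs_scrub_l2arc_refresh and
 * scan_read_metadata_sketch() are illustrative names, not existing
 * OpenZFS code.
 */
#include <sys/arc.h>
#include <sys/spa.h>
#include <sys/zio.h>

int zfs_scrub_l2arc_refresh = 0;	/* default: old behaviour */

static int
scan_read_metadata_sketch(spa_t *spa, const blkptr_t *bp,
    const zbookmark_phys_t *zb)
{
	arc_flags_t aflags = ARC_FLAG_WAIT;
	arc_buf_t *buf = NULL;
	int err;

	if (zfs_scrub_l2arc_refresh) {
		/* Mark the buffer as eligible for L2ARC caching. */
		aflags |= ARC_FLAG_L2CACHE;
	}

	/* Read through the ARC so the buffer becomes ARC-resident. */
	err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf,
	    ZIO_PRIORITY_SCRUB, ZIO_FLAG_CANFAIL, &aflags, zb);
	if (err == 0 && buf != NULL)
		arc_buf_destroy(buf, &buf);
	return (err);
}
```

In practice such a hook would live in the scrub code path (dsl_scan.c), and a default of 0 would preserve the old behaviour.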
Additional context
Component: SCRUB
Component: ARC/L2ARC
A tunable would be needed to (de)activate the new feature. Default = old behaviour.
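On the Linux and FreeBSD ports, such a tunable would typically be exposed as a module parameter. A hypothetical declaration (the parameter name is illustrative, not an existing OpenZFS tunable), following the usual ZFS_MODULE_PARAM pattern, might look like:

```c
/* Hypothetical tunable; the name is illustrative, not an existing parameter. */
int zfs_scrub_l2arc_refresh = 0;	/* 0 = old behaviour (default) */

ZFS_MODULE_PARAM(zfs, zfs_, scrub_l2arc_refresh, INT, ZMOD_RW,
	"Make pool metadata read by scrub eligible for the L2ARC");
```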
Further information (added 2024-08-07):
This feature is intended for pools:
The idea of this feature is that the pool metadata read during scrub is not only used for scrub, but also made available to ARC for storage in the L2ARC. A scheduled/cyclic scrub would therefore lead to a cyclic on-the-fly regeneration of the pool metadata in the L2ARC.
This feature would also enable the user to (re)populate the L2ARC with the pool metadata without having to, e.g., delete the pool and send/receive it back from a backup pool.