Open cmurphycode opened 3 years ago
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
The prefetch code, and probably other related code, has been modified a number of times since the mentioned 0.7.12. For example, 891568c9907b3631f99a1079344bddf62ca70f56 from Mar 20 2021 started blocking prefetch reads on the second read when all data are in the ARC, so the second read should be accounted only as demand hits.
System information
Describe the problem you're observing
arcstats shows a suspiciously large number of prefetch misses for one-time sequential read workflows.
Describe how to reproduce the problem
NOTE: I logged this against the 0.7.12 version, but I don't see any code difference in this area. I observed this accounting issue during a particular IO pattern I was running, but it can be simplified to the following:
Running arcstat while this test runs shows what I mean. I left a one-sample gap between the two runs:
Notes: The first time we read the data, we have to do the IO to the vdevs, so it's slower. That's fine, but we should be getting nice prefetch benefits too. We can see in iostat that ZFS is indeed trying quite hard to prefetch: note the avgqu-sz on zd32 compared to sdd (the vdev).
(For the second run through, there is no activity to the backend vdev because the ARC is large enough to contain the entire dataset.)
So the caching itself is working fine, but the accounting seems wrong. In the first test, we should be getting at least some prefetch hits, but we get essentially zero. We are getting a 100% demand hit rate, which should likely be near zero. In the second test, we get a 100% prefetch hit rate, and we get some demand hits too. I think the second result makes sense, as it just depends on whether a prefetch got fired off before the demand read could be satisfied from the ARC.
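For anyone who wants to check the per-run rates without relying on arcstat's columns, here is a minimal sketch that computes them from two snapshots of the arcstats kstat. It assumes the usual Linux location `/proc/spl/kstat/zfs/arcstats` and the `demand_*` / `prefetch_*` hit and miss counters; the helper names (`parse_arcstats`, `read_arcstats`, `hit_rates`) are made up for this example.

```python
# Sketch: per-class ARC hit rates between two arcstats snapshots.
# Helper names are hypothetical; the kstat path is the usual Linux one.

CLASSES = ("demand_data", "demand_metadata",
           "prefetch_data", "prefetch_metadata")


def parse_arcstats(text):
    """Parse kstat text ("name type data" rows) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Counter rows have exactly three fields with a numeric value;
        # the two header rows do not match and are skipped.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def read_arcstats(path="/proc/spl/kstat/zfs/arcstats"):
    """Take one snapshot from a live system (assumed path)."""
    with open(path) as f:
        return parse_arcstats(f.read())


def hit_rates(before, after):
    """Return hits / (hits + misses) per class over the interval.

    A class with no accesses in the interval maps to None.
    """
    rates = {}
    for cls in CLASSES:
        hits = after.get(f"{cls}_hits", 0) - before.get(f"{cls}_hits", 0)
        misses = after.get(f"{cls}_misses", 0) - before.get(f"{cls}_misses", 0)
        total = hits + misses
        rates[cls] = hits / total if total else None
    return rates
```

Taking one `read_arcstats()` snapshot before a read pass and one after, then passing both to `hit_rates`, gives the demand and prefetch hit rates for that pass alone.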
The first result, where no credit is given for prefetches, may be due to the following code: https://github.com/openzfs/zfs/blob/zfs-2.1-release/module/zfs/arc.c#L5577 https://github.com/openzfs/zfs/blob/zfs-2.1-release/module/zfs/arc.c#L6070
Note that any time we are going to increment the prefetch/demand hit/miss data/metadata counters, we have called arc_access immediately before. By my simplistic reading of arc_access (https://github.com/openzfs/zfs/blob/zfs-2.1-release/module/zfs/arc.c#L5413), we clear the prefetch flag from the ARC entry in the MRU case. This seems to match what I'm seeing: the first time through, a prefetch buffer will be found in the MRU and accounted for as a regular MRU demand hit. The second time through, I guess the buffer will be in the MFU, where we do not clear the prefetch flag. This means that arc_read (if it is a prefetch) can correctly increment the prefetch hit counter.
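A toy model of that reading, as a sketch only: it is not the arc.c logic, just the ordering described above (set the flag from the request, run the access step, then pick the counter from the flag on the entry). All names here (`Header`, `arc_access_model`, `account_hit`, `replay`) are hypothetical, and the model only covers a prefetch read that hits an entry already in the cache.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Header:
    state: str              # "mru" or "mfu"
    prefetch: bool = False  # stand-in for the prefetch flag on the entry


def arc_access_model(hdr):
    """Assumed behaviour: the MRU case clears the prefetch flag,
    the MFU case leaves it alone."""
    if hdr.state == "mru":
        if hdr.prefetch:
            hdr.prefetch = False   # flag cleared before accounting
        else:
            hdr.state = "mfu"      # second plain access promotes to MFU
    # "mfu": flag is left untouched in this model


def account_hit(hdr, request_is_prefetch, stats):
    """Account one cache hit the way the reading above describes."""
    if request_is_prefetch:
        hdr.prefetch = True
    arc_access_model(hdr)          # runs *before* the counter is chosen
    stats["prefetch_hits" if hdr.prefetch else "demand_hits"] += 1


def replay(state):
    """Issue one prefetch read against an entry in the given state."""
    stats = Counter()
    account_hit(Header(state), True, stats)
    return dict(stats)


print(replay("mru"))  # {'demand_hits': 1}
print(replay("mfu"))  # {'prefetch_hits': 1}
```

Under these assumptions, the same prefetch read is credited as a demand hit when the entry is in the MRU and as a prefetch hit when it is in the MFU, which is the asymmetry between the first and second pass.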
Thoughts? I realize that true accounting of "credit" for prefetch vs demand is not necessarily a 100% clear distinction, but the way this works here definitely was not my expectation.
Thank you!