Closed raj-kumar-soni closed 3 years ago
Looks like this will fix #299
Ah yes - the logic for setting zfs_arc_max
is no longer run, but the reason it was taken out, is that ksp
when called is for osx_kstat_update()
is a osx_kstat_t osx_kstat
, then we pass that to arc_update()
as if it is kstat_t *arc_ksp
. Quite naughty. A quick fix would be to change to:
arc_kstat_update(arc_ksp, rw);
arc_kstat_update_osx(ksp, rw);
Ah yes - the logic for setting
zfs_arc_max
is no longer run, but the reason it was taken out, is thatksp
when called is forosx_kstat_update()
is aosx_kstat_t osx_kstat
, then we pass that toarc_update()
as if it iskstat_t *arc_ksp
. Quite naughty. A quick fix would be to change to:arc_kstat_update(arc_ksp, rw); arc_kstat_update_osx(ksp, rw);
With above changes (using arc_ksp
instead of ksp
in arc_kstat_update()
) not able to install ZFSin as arc_ksp
initialize after this.
It crashes with stack trace:-
ZFSin!arc_kstat_update+0xd [C:\ZFS-vt-code\ZFSin\ZFSin\zfs\module\zfs\arc.c @ 7329] ZFSin!osx_kstat_update+0x5a7 [C:\ZFS-vt-code\ZFSin\ZFSin\zfs\module\zfs\zfs_kstat_windows.c @ 430] ZFSin!kstat_osx_init+0xa0 [C:\ZFS-vt-code\ZFSin\ZFSin\zfs\module\zfs\zfs_kstat_windows.c @ 625] ZFSin!DriverEntry+0x2e [C:\ZFS-vt-code\ZFSin\ZFSin\driver.c @ 75] ZFSin!FxDriverEntryWorker+0xa9 [d:\5359\minkernel\wdf\framework\kmdf\src\dynamic\stub\stub.cpp @ 287] nt!IopLoadDriver+0x4bd nt!IopLoadUnloadDriver+0x4a nt!ExpWorkerThread+0x16a nt!PspSystemThreadStartup+0x55 nt!KiStartSystemThread+0x1c
Yeah, we would need to find a solution. Although, the zfs_arc_max and friends are updated in arc_kstat_update_osx(), perhaps the answer is to just bring that one back.
@lundman Is something wrong with this PR? Or the build? Please let me know how we can fix - "All checks have failed".
If nothing is wrong with the PR, can this change be approved?
Ah yes, it is wrong and I was expecting an update. But it might be easier if I update it? We do need to set the values, this is true, but the old code passes the wrong struct to arc update - its pure luck that struct is big enough to fit all the writes.
But it might be easier if I update it?
@lundman It would be wonderful if you could provide the update. We will be happy to test it and update you with our findings. We don't have enough awareness about that part of code to fix it ourselves in the near term.
It comes down to:
static int osx_kstat_update(kstat_t *ksp, int rw)
which expects that ksp to be:
typedef struct osx_kstat {
kstat_named_t spa_version;
kstat_named_t zpl_version;
and that update function does indeed update those members. I believe this is correct (hurrah), but then we call
arc_kstat_update(ksp, rw);
arc_kstat_update_osx(ksp, rw);
using the same ksp
.
In this case:
arc_kstat_update(kstat_t *ksp, int rw)
expects the struct
typedef struct arc_stats {
kstat_named_t arcstat_hits;
kstat_named_t arcstat_misses;
kstat_named_t arcstat_demand_data_hits;
So when arc_kstat_update()
updates arcstat_hits
it is actually overwriting spa_version
! Yikes.
Later on, kstat calls arc_kstat_update()
with the correct ksp
- so values are put back to real ones. You'd have to be pretty lucky with timing to "see" it running kstat, since the calls are right after each other. But really, we are writing ints, and strings, over struct members that are neither, then writing the correct ones back.
It comes down to:
static int osx_kstat_update(kstat_t *ksp, int rw)
which expects that ksp to be:typedef struct osx_kstat { kstat_named_t spa_version; kstat_named_t zpl_version;
and that update function does indeed update those members. I believe this is correct (hurrah), but then we call
arc_kstat_update(ksp, rw); arc_kstat_update_osx(ksp, rw);
using the same
ksp
.In this case:
arc_kstat_update(kstat_t *ksp, int rw)
expects the structtypedef struct arc_stats { kstat_named_t arcstat_hits; kstat_named_t arcstat_misses; kstat_named_t arcstat_demand_data_hits;
So when
arc_kstat_update()
updatesarcstat_hits
it is actually overwritingspa_version
! Yikes.Later on, kstat calls
arc_kstat_update()
with the correctksp
- so values are put back to real ones. You'd have to be pretty lucky with timing to "see" it running kstat, since the calls are right after each other. But really, we are writing ints, and strings, over struct members that are neither, then writing the correct ones back.
Hmm, interesting. Will you be able to submit the fix sometime soon? Say in a week? If not, I will have someone take a look at this thread and raise a PR. Please do let me know.
Yeah, I'll push something tomorrow
Yeah, I'll push something tomorrow
@lundman Are there more changes in the pipeline? With this commit alone, we still don't see the arc_max limit being honored.
Total RAM: 128 GB ZFSin memory usage : 63.4 GB zfs_arc_max : 15 GB
Through other commits I think we have corrected this. Please re-open if issues remain.
Through other commits I think we have corrected this. Please re-open if issues remain.
Thanks, @lundman. Yes, your latest checkins do honor the arc_max limit. The arc_max limit does get violated for a brief period but then the arc usage falls below the arc_max within a few seconds.
This patch partially reverts commit 88d37b7a1ae78ae36bfac16e7602885f54b266cc The issue was that the os_mem_alloc was keeping on increasing when IO workload was running, not honouring zfs_arc_max and zfs_arc_meta_limit.