Open arun-kv opened 4 years ago
@lundman We may have to consider splitting arc_abd_move_thr_fini() https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/zfs/module/zfs/arc.c#L9412
into two, where we do
mutex_enter(&arc_abd_move_thr_lock);
cv_signal(&arc_abd_move_thr_cv);
arc_abd_move_thr_exit = 1;
while (arc_abd_move_thr_exit != 0)
cv_wait(&arc_abd_move_thr_cv, &arc_abd_move_thr_lock);
mutex_exit(&arc_abd_move_thr_lock);
in the first and
mutex_destroy(&arc_abd_move_thr_lock);
cv_destroy(&arc_abd_move_thr_cv);
in the second.
The second part should be delayed until the "end" of driver unload. That way other threads can still inspect arc_abd_move_thr_exit, and lock/unlock the mutex protecting it, without touching a primitive that has already been destroyed.
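A rough userspace sketch of the two-phase teardown described above, using pthreads as a stand-in for the kernel `mutex_enter`/`cv_wait` primitives. All names here (`move_thr_stop`, `move_thr_poke`, `move_thr_destroy`, and so on) are hypothetical and not taken from arc.c. It also assumes the worker acknowledges the exit request by resetting the flag to 0 and signalling the cv, which is what the wait loop in the snippet implies.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for arc_abd_move_thr_lock / _cv / _exit. */
static pthread_mutex_t move_thr_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t move_thr_cv = PTHREAD_COND_INITIALIZER;
static int move_thr_exit = 0;
static pthread_t move_thr;

/* Worker: sleeps until asked to exit, then acknowledges (assumed protocol). */
static void *move_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&move_thr_lock);
	while (move_thr_exit == 0)
		pthread_cond_wait(&move_thr_cv, &move_thr_lock);
	move_thr_exit = 0;			/* acknowledge the request */
	pthread_cond_broadcast(&move_thr_cv);
	pthread_mutex_unlock(&move_thr_lock);
	return NULL;
}

int move_thr_init(void)
{
	return pthread_create(&move_thr, NULL, move_thread, NULL);
}

/* Phase 1: stop the thread but leave the lock and cv intact. */
void move_thr_stop(void)
{
	int rc;

	pthread_mutex_lock(&move_thr_lock);
	move_thr_exit = 1;
	pthread_cond_broadcast(&move_thr_cv);
	while (move_thr_exit != 0)
		pthread_cond_wait(&move_thr_cv, &move_thr_lock);
	pthread_mutex_unlock(&move_thr_lock);

	rc = pthread_join(move_thr, NULL);
	assert(rc == 0);
	(void)rc;
}

/* What a late caller (e.g. a reclaim thread) might do between the phases. */
int move_thr_poke(void)
{
	int rc;

	pthread_mutex_lock(&move_thr_lock);
	rc = pthread_cond_signal(&move_thr_cv);	/* still valid after phase 1 */
	pthread_mutex_unlock(&move_thr_lock);
	return rc;
}

/* Phase 2: destroy the primitives, only once nothing can touch them. */
int move_thr_destroy(void)
{
	int rc = pthread_mutex_destroy(&move_thr_lock);

	return rc != 0 ? rc : pthread_cond_destroy(&move_thr_cv);
}
```

The intended call order would be `move_thr_init()`, then `move_thr_stop()` early in teardown, any number of late `move_thr_poke()` calls from other threads, and `move_thr_destroy()` as the very last step of unload.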
To be honest, the abd_move work came from osx, where I have already removed it when merging with the new port. It was decided that if the need comes up again, we'll re-implement it, since the way abd is set up is a little different. At this point, I'm inclined to remove it from ZFSin as well.
Nice catch though!
Thanks @lundman. We will take this change in our environment and see how it goes.
In arc_fini we signal the arc_abd_move_thread to exit first, and then destroy arc_abd_move_thr_cv. https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/zfs/module/zfs/arc.c#L7837 We then signal the arc_reclaim_thread, which in turn tries to signal arc_abd_move_thr_cv, which has already been destroyed. https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/zfs/module/zfs/arc.c#L5183 This leads to an occasional panic during uninstallation.