Both of these updates are minimal changes for bug fixes.
4.5.3.4:
Intel ID: 22018540857
Description:
Getting resource busy errors while SR-IOV VF resources are requested
Root Cause:
A race between iavf_watchdog_task() and iavf_adminq_task():
iavf_adminq_task() saves event data from VIRTCHNL_EVENT_RESET_IMPENDING, but
then blocks waiting for the critical section lock while iavf_watchdog_task()
handles the device reset. Once iavf_watchdog_task() finally completes the
reset, iavf_adminq_task() continues executing and processes the now-stale
VIRTCHNL_EVENT_RESET_IMPENDING message. This causes the driver to set the
reset state flag and forces the watchdog into waiting for the hardware reset
to start. Since 3.7.81, the driver stays waiting for a reset indefinitely,
until the next time a hardware reset actually occurs. This causes most
userspace interactions to report -EBUSY, and causes the driver to print
"Never saw reset" once every 5 seconds.
Resolution Notes:
Fix the locking in iavf_adminq_task() so that it acquires the critical section
lock before reading from the receive queue. With this change, the thread
executes either entirely before iavf_watchdog_task() or entirely after it. If
it executes after iavf_watchdog_task(), it will not see the stale
VIRTCHNL_EVENT_RESET_IMPENDING message, because the message is cleared when
the hardware resets and the driver re-initializes the receive queue. Thus,
iavf_adminq_task() no longer reports a stale reset event.
Testing Hints:
1) Reduce the number of CPUs available by using isolcpus at boot, limiting
kernel tasks to only 1 or 2 cores. (This matches our customer's setup.)
2) create 32 VFs
3) spam changing port VLAN on all 32 VFs, turning it on, then off.
4) also spam reading statistics from ethtool on all 32 VFs
The ethtool stats spam ensures the watchdog task executes rapidly; otherwise
it runs only once every 2 seconds when no background tasks are executing. The
race only manifests if the watchdog task is already executing when the adminq
task tries to acquire the critical section lock.
Able to reproduce this easily (within 5-10 minutes) on 32 VFs using the
above setup. With the fix, no issue occurs.
4.5.3.2:
Intel ID: 22018832934
Description:
VF resets result in "Failed to init adminq: -53" errors, triggering hangs
Root Cause:
Calling iavf_close in the iavf_handle_hw_reset error handling path can lead to
a double call of napi_disable, which causes a deadlock in the kernel.
Resolution Notes:
In this driver, the issue was that iavf_close waited for the
__IAVF_IN_CRITICAL_TASK bit while iavf_watchdog_task, which called all of this
code, already owned that bit. This led to the deadlock reported by iavf_remove
(when triggered).
Fix this by calling iavf_disable_vf without running the cleanup logic.
Testing Hints:
while true; do
    ip link set vf_intf up
    ip link set dev pf_intf vf 0 trust on
    ip link set dev pf_intf vf 0 vlan 333
    ip link set dev pf_intf vf 0 trust off
    ip link set dev pf_intf vf 0 vlan 310
    ip link set vf_intf down
    sleep 0.1
done