Current NMF updates use a step of $\frac{-W^TA_{:j}}{W^TW}$ for updates of $H$ with Sequential Coordinate Descent (SCD).
We could avoid SCD by adding an additional term to the updates: $\frac{W^TWH_{:j} - W^TA_{:j}}{W^TW}$ for updates of $H$. The same logic applies to updates of $W$. Since we already have $a = W^TW$, the only additional operations are $aH_{:j}$ (which is very cheap) and the subtraction of the two terms in the numerator. This is likely to be faster, but it is unclear whether the convergence properties will be as good. Perhaps an Adam optimizer would be applicable to this update, whereas it does not benefit ALS with SCD.
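To make the comparison concrete, here is a minimal NumPy sketch of both updates for a single column $H_{:j}$. It rests on assumptions that are not stated above: the division by $W^TW$ is read as elementwise division by its diagonal, a non-negativity clamp follows each step, and the function names (`scd_update`, `simultaneous_update`, `objective`) are hypothetical rather than taken from any existing code.

```python
import numpy as np

def scd_update(a, b, h, n_sweeps=1):
    """Sequential coordinate descent on 0.5*h^T a h - b^T h subject to h >= 0.
    Each coordinate sees the already-updated values of the previous ones."""
    h = h.copy()
    for _ in range(n_sweeps):
        for i in range(h.size):
            grad_i = a[i] @ h - b[i]          # i-th entry of (a h - b)
            h[i] = max(0.0, h[i] - grad_i / a[i, i])
    return h

def simultaneous_update(a, b, h, n_steps=1):
    """Proposed alternative (as interpreted here): update every coordinate at
    once with the full numerator a h - b, scaled by diag(a), then clamp."""
    d = np.diag(a)                            # assumption: divide by diag(W^T W)
    h = h.copy()
    for _ in range(n_steps):
        h = np.maximum(0.0, h - (a @ h - b) / d)
    return h

def objective(a, b, h):
    """Least-squares objective for one column, up to an additive constant."""
    return 0.5 * h @ a @ h - b @ h

# Small synthetic problem (random data, for illustration only).
rng = np.random.default_rng(0)
W = rng.random((50, 4))
A_j = rng.random(50)

a = W.T @ W          # shared across all columns j
b = W.T @ A_j        # right-hand side for column j
h0 = np.full(4, 0.1)

h_scd = scd_update(a, b, h0, n_sweeps=20)
h_sim = simultaneous_update(a, b, h0, n_steps=20)

print("SCD objective:         ", objective(a, b, h_scd))
print("simultaneous objective:", objective(a, b, h_sim))
```

The simultaneous version replaces the inner loop with one matrix-vector product, which is where the speedup would come from. Unlike SCD, though, a simultaneous step is not guaranteed to decrease the objective when the columns of $W$ are strongly correlated, so a damping factor or an adaptive step size (such as Adam) may be needed in practice.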