Consider using explicit X-vs-ReLU(L) loss instead of Z-vs-L loss for all algorithms

Which of these best describes your feature request:

[ ] Library usability improvement
[x] Performance improvement (speed, accuracy)
[ ] New methodology or algorithm

Describe how the new feature would improve the library:

As pointed out in PR #18, most of the kernel methods compute the per-iteration loss as the norm of the difference between the utility matrix Z and the low-rank reconstruction L. This is a proxy for the true optimization target, which is the loss between the post-ReLU low-rank candidate matrix max(0.0, L) and the input sparse matrix X. The proxy is fine in most cases because, by construction, Z's only positive values are those of the original sparse matrix X.
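To make the two quantities concrete, here is a minimal NumPy sketch. It is not the library's code: the function names `proxy_loss` and `explicit_loss` are hypothetical, the Frobenius norm is an assumption about which norm is used, and the way Z is built here (X on X's positive entries, min(0, L) elsewhere) is only one construction consistent with the property described above.

```python
import numpy as np


def proxy_loss(Z, L):
    """Proxy: norm of the difference between Z and L (Frobenius norm assumed)."""
    return float(np.linalg.norm(Z - L))


def explicit_loss(X, L):
    """Explicit target: norm of X minus the post-ReLU reconstruction max(0, L)."""
    return float(np.linalg.norm(X - np.maximum(0.0, L)))


# Small illustrative inputs (made up for this sketch).
X = np.array([[1.0, 0.0],
              [0.0, 2.0]])
L = np.array([[0.9, -0.3],
              [0.2, 2.1]])

# Assumed construction of Z: keep X where X is positive, and use the
# non-positive part of L everywhere else, so Z's only positive values are X's.
Z = np.where(X > 0, X, np.minimum(0.0, L))

print(proxy_loss(Z, L), explicit_loss(X, L))
```

On this example the two values coincide, which is the situation in which the proxy is a safe stand-in.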

However, momentum-based methods (such as the Aggressive Momentum method implemented in #18) risk breaking that property: the momentum effect may create positive values in Z that do not match X.
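As an illustration of how that can happen, the sketch below applies a generic extrapolation step, Z + beta * (Z - Z_prev), to made-up values. This is an assumption chosen for illustration, not the update rule from #18, and `positive_mismatch` is a hypothetical helper name.

```python
import numpy as np


def positive_mismatch(Z, X):
    """Boolean mask of entries where Z is positive but does not equal X."""
    return (Z > 0) & ~np.isclose(Z, X)


# Made-up values: X is the sparse input, Z_prev and Z_curr are two
# consecutive utility matrices that both respect the property above.
X = np.array([[1.0, 0.0],
              [0.0, 2.0]])
Z_prev = np.array([[1.0, -0.5],
                   [-0.2, 2.0]])
Z_curr = np.array([[1.0, -0.1],
                   [-0.3, 2.0]])

beta = 0.5  # hypothetical momentum parameter

# Generic extrapolation step (assumed form, not taken from the library).
Z_extrap = Z_curr + beta * (Z_curr - Z_prev)

mask = positive_mismatch(Z_extrap, X)
print(mask)
```

Here the entry at row 0, column 1 was moving toward zero from below, so the extrapolation pushes it past zero even though X is zero there.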

This raises the question of whether, in principle, the loss should be computed from the actual reconstruction rather than the proxy, even in methods where momentum is not an issue.

Describe the solution you'd like

If we decide to make this change, it would essentially involve promoting the model_free_util.reconstruct_X_from_L function introduced in PR #18 to a more general location, and rewriting the compute_loss function in loss_util.py so that it always applies the reconstruction and uses the original sparse matrix, rather than the proxy.

Describe alternatives you've considered

The clear alternative is to leave the existing code unchanged, and continue using the proxy Z - L where it's currently used.

Additional context

Before making a change of this nature more generally, I would like to do two things: discuss it with scientific stakeholders, to see whether they agree the change is motivated; and profile the cost of the change across a range of matrix sizes, to get a more quantitative sense of the performance impact.
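One possible starting point for the profiling step is a small timing harness like the one below. It is only a sketch under assumptions: dense NumPy arrays stand in for the sparse input, the Frobenius norm is assumed, and the names `time_call` and `profile_losses`, along with the default sizes, rank, and density, are hypothetical choices rather than anything from the library.

```python
import time

import numpy as np


def time_call(fn, repeats=5):
    """Return the best wall-clock time (seconds) over several calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def profile_losses(sizes=(100, 400), rank=5, density=0.1, seed=0):
    """Time the proxy and explicit loss on random n-by-n inputs of each size."""
    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        # Random non-negative X with roughly `density` fraction of nonzeros.
        keep = rng.random((n, n)) < density
        X = np.where(keep, rng.random((n, n)), 0.0)
        # Random rank-`rank` candidate L.
        L = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))
        # Assumed construction of Z: X on its positive entries, min(0, L) elsewhere.
        Z = np.where(X > 0, X, np.minimum(0.0, L))
        results[n] = {
            "proxy": time_call(lambda: np.linalg.norm(Z - L)),
            "explicit": time_call(lambda: np.linalg.norm(X - np.maximum(0.0, L))),
        }
    return results


if __name__ == "__main__":
    for n, timings in profile_losses().items():
        print(n, timings)
```

A real measurement would need to use the library's actual sparse representation and loss code, since the cost of the reconstruction depends heavily on how X is stored.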
