Closed RyanTang1 closed 6 years ago
Hello, I'm trying to reproduce L1 regularization based on your implementation, but I noticed that you still apply a threshold even with L1 regularization. Doesn't L1 produce a sparse matrix even without a threshold? When I omitted the threshold from my L1 regularization, the weight matrix did not become sparse. So I'm wondering whether L1 regularization alone actually cannot produce a sparse matrix without thresholding.

It can push the weights very close to zero, but it cannot fix them at exactly zero, because of the ever-present fluctuating weight updates (the L1 gradient is always -1 or +1). I suspect that L1 regularization without thresholding pushes many weights to near-zero values; you can then remove the small weights by thresholding after training, and it should not affect accuracy.
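To illustrate the point above, here is a minimal NumPy sketch (hypothetical, not the code from this repository): plain gradient descent with an L1 subgradient drives irrelevant weights close to zero, but the constant ±lambda nudge from the sign term keeps them oscillating around zero rather than landing exactly on it, so a small post-training threshold is what actually zeroes them out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: only the first 2 of 10 features matter.
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:2] = [3.0, -2.0]
y = X @ true_w + 0.01 * rng.normal(size=200)

w = rng.normal(size=10)
lr, lam = 0.01, 0.1  # learning rate and L1 strength (illustrative values)
for _ in range(2000):
    grad = X.T @ (X @ w - y) / len(y)    # data-loss gradient
    w -= lr * (grad + lam * np.sign(w))  # L1 subgradient: lam * sign(w), i.e. +/- lam

# The irrelevant weights are tiny but essentially never *exactly* zero,
# because every step adds a +/- lr*lam nudge from the sign term.
exact_zeros = int(np.sum(w == 0.0))

# Thresholding after training recovers the true sparsity pattern.
w_sparse = np.where(np.abs(w) < 1e-2, 0.0, w)
sparse_zeros = int(np.sum(w_sparse == 0.0))
```

In this run `exact_zeros` is 0 while `sparse_zeros` is 8, matching the suggestion above: L1 concentrates weights near zero, and the threshold converts "near zero" into "exactly zero" without touching the large, accuracy-relevant weights. (Proximal methods such as ISTA would produce exact zeros directly, but plain SGD with the raw L1 gradient does not.)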