Sparse Confusion

Thanks for sharing the code! I am confused about the L1 regularization mentioned in the paper.

The optimizer used in main.py is SGD, and its weight_decay parameter is also set. Shouldn't that be equivalent to L2 regularization?

I also read the BNupdate function, but I still don't understand why it is L1, since BNupdate just adds or subtracts a constant to each gamma value?
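To make the comparison concrete, here is a minimal sketch of the two gradient terms being asked about. It assumes PyTorch-style SGD, where weight_decay adds wd * w to the gradient, and it assumes the BN update adds s * sign(gamma) to the gradient of each gamma. The helper names and the values of s and wd are hypothetical, not taken from the repo.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def l1_subgradient(gamma, s):
    # Subgradient of s * |gamma| with respect to gamma:
    # the magnitude is always s, only the sign follows gamma.
    return s * sign(gamma)

def l2_weight_decay_grad(w, wd):
    # Gradient of (wd / 2) * w**2 with respect to w:
    # proportional to w (assumed PyTorch-style weight_decay).
    return wd * w

# Hypothetical values, for illustration only.
gammas = [0.5, -0.2, 2.0]
s = 1e-4
wd = 1e-4

l1_terms = [l1_subgradient(g, s) for g in gammas]
l2_terms = [l2_weight_decay_grad(g, wd) for g in gammas]

print("L1 terms:", l1_terms)
print("L2 terms:", l2_terms)
```

Under these assumptions, the L1 term has the same magnitude s for every gamma and only its sign changes, which would look like "adding or subtracting a constant", while the weight_decay term scales with the value itself.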