NVlabs / MaskLLM

[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
https://vainf.github.io/maskllm-project-page

Question about Weight Regularization #1

Closed CoderChen01 closed 3 weeks ago

CoderChen01 commented 3 weeks ago

Excellent work! I have a question regarding weight regularization. Is it possible for negative loss values to appear during the training process?

VainF commented 3 weeks ago

Hi @CoderChen01, in our experiments we did not observe any negative loss. This is likely because we used relatively small coefficients for the weight regularization. However, negative loss can still occur if you use a very large regularization strength.
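
For illustration, here is a minimal sketch (not the repository's actual training code) of how this can happen, assuming the weight-regularization term is subtracted from the language-modeling loss to reward keeping large-magnitude weights. The names `lm_loss`, `masked_weight`, and `reg_strength` are hypothetical placeholders:

```python
import torch

def total_loss(lm_loss: torch.Tensor,
               masked_weight: torch.Tensor,
               reg_strength: float) -> torch.Tensor:
    # Hypothetical regularizer: negative squared norm of the retained (masked)
    # weights. Subtracting it encourages the mask to keep large-magnitude weights.
    reg_term = -masked_weight.pow(2).sum()
    return lm_loss + reg_strength * reg_term

lm_loss = torch.tensor(2.5)               # typical positive LM loss value
masked_weight = torch.randn(1024) * 0.1   # toy stand-in for the masked weights

print(total_loss(lm_loss, masked_weight, reg_strength=1e-4))  # stays positive
print(total_loss(lm_loss, masked_weight, reg_strength=1.0))   # can dip below zero
```

With a small `reg_strength`, the positive LM loss dominates and the total stays above zero; with a very large coefficient, the negative regularization term can outweigh it, which matches the behavior described above.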