adefazio / point-saga

Accelerated SAGA
MIT License
24 stars 12 forks

l1 regularisation #1

Open NikZak opened 2 years ago

NikZak commented 2 years ago

Thank you for a great, clean, and well-structured implementation.

Although the paper says that point-SAGA is extendable to non-smooth functions, it is not immediately clear how l1 regularisation can be applied in point-SAGA. Should it be applied after the prox_f_i step, as a prox_h operator, as in SAGA?
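For reference, the "as in SAGA" pattern the question alludes to is the proximal variant of SAGA, where the l1 term is handled by a soft-thresholding prox applied after each variance-reduced gradient step. A minimal NumPy sketch on a least-squares problem (the function names and problem data here are illustrative, not from this repo):

```python
import numpy as np

def prox_l1(x, t):
    # Soft-thresholding: the proximal operator of t * ||x||_1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def saga_l1(A, b, lam, step, epochs=50, seed=0):
    """Prox-SAGA on (1/n) sum_i 0.5*(a_i^T x - b_i)^2 + lam*||x||_1.

    The prox of the l1 term is applied after each SAGA gradient step.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    # Table of stored per-example gradients, and their running mean.
    g = (A @ x - b)[:, None] * A
    g_avg = g.mean(axis=0)
    for _ in range(epochs * n):
        i = rng.integers(n)
        gi_new = (A[i] @ x - b[i]) * A[i]
        # SAGA direction: fresh gradient minus stored gradient plus table mean.
        v = gi_new - g[i] + g_avg
        # Gradient step, then the l1 prox (soft-thresholding).
        x = prox_l1(x - step * v, step * lam)
        # Update the gradient table and its mean incrementally.
        g_avg += (gi_new - g[i]) / n
        g[i] = gi_new
    return x
```

With `lam=0` this reduces to plain SAGA and converges to the least-squares solution; with `lam>0` the soft-thresholding step produces sparse iterates.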

adefazio commented 2 years ago

I'm glad you found the implementation useful! There has been follow-up work by other researchers adding support for separate L1 regularisers to point-SAGA. The following paper is a good reference for how to do it: https://link.springer.com/article/10.1007/s10994-019-05785-3. It's essentially Douglas-Rachford splitting on top of point-SAGA.
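To illustrate the splitting idea in its simplest form: Douglas-Rachford alternates the two proximal operators through a reflected fixed-point iteration, so a second non-smooth term like l1 only needs its own prox. A minimal deterministic sketch for min_x 0.5*||Ax - b||^2 + lam*||x||_1 (this is a generic DR iteration under illustrative names, not the stochastic point-SAGA variant from the cited paper):

```python
import numpy as np

def prox_l1(z, t):
    # Soft-thresholding: prox of t * ||x||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_quadratic(z, t, A, b):
    # Prox of t * 0.5*||Ax - b||^2: solve (I + t A^T A) x = z + t A^T b.
    d = A.shape[1]
    return np.linalg.solve(np.eye(d) + t * (A.T @ A), z + t * (A.T @ b))

def douglas_rachford(A, b, lam, t=1.0, iters=500):
    """Douglas-Rachford splitting for min 0.5*||Ax-b||^2 + lam*||x||_1."""
    z = np.zeros(A.shape[1])
    for _ in range(iters):
        x = prox_l1(z, t * lam)                   # prox of the non-smooth term
        y = prox_quadratic(2 * x - z, t, A, b)    # prox of the smooth term at the reflection
        z = z + (y - x)                           # update the governing sequence
    return x
```

At a fixed point of the z-iteration, x = prox_l1(z, t*lam) satisfies the optimality condition of the combined objective, which is why each term only ever needs its own prox.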

NikZak commented 2 years ago

Thanks! Nice trick!

I was just looking into implementing SAGA in Rust, and that's how I ended up here. A few questions arise, if you don't mind me asking.

1) It seems to me that this Prox2SAGA is strictly better than SAGA: it appears to work for sparse data sets (though I'm not sure) and it supports l1 regularisation. Sorry, I would not want to offend in any sense; it stands on the shoulders of giants. Is my understanding correct? Could you confirm that it is applicable to all sparse problems and that the lagged update can be done?

2) Is it a best-in-class algorithm for at least some classes of problems (e.g. logistic regression on large sparse input with l1 regularisation)? It looks like SAGA with the Katyusha momentum trick is head-to-head with point-SAGA and potentially shows better performance (and is also applicable to sparse data and extendable with an l1 regulariser): https://arxiv.org/pdf/1806.11048.pdf. ProxASAGA also looks promising and suited to sparse problems with l1: https://arxiv.org/pdf/1707.06468.pdf. But I am not an expert at all. Am I missing something big?