The "set_data" method of "DataDependent" loss functions ends up keeping references to the X and y blocks. This may cause memory leak issues: when you call fit on a model, you expect it to store the fitted parameters, not the whole input dataset. So I suggest that "set_data" compute only the quantities that depend solely on the data, for example the Lipschitz constant. This also means that f and grad should accept X and y as parameters.
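To make the suggestion concrete, here is a minimal sketch of what such an interface could look like. The class name `LeastSquaresLoss`, the choice of a squared loss, and the fixed-step gradient loop are assumptions made for illustration only, not the library's actual code.

```python
import numpy as np


class LeastSquaresLoss:
    """Hypothetical data-dependent loss: f(beta) = 0.5 * ||X @ beta - y||^2."""

    def __init__(self):
        self.lipschitz = None  # the only state derived from the data

    def set_data(self, X, y):
        # Compute the data-dependent constant, but do not store X or y.
        # For this loss the gradient is Lipschitz with constant sigma_max(X)^2.
        self.lipschitz = np.linalg.norm(X, 2) ** 2
        return self

    def f(self, beta, X, y):
        # The data is passed in by the caller on every evaluation.
        residual = X @ beta - y
        return 0.5 * float(residual @ residual)

    def grad(self, beta, X, y):
        return X.T @ (X @ beta - y)


# Usage: the caller owns X and y; the loss object only keeps the constant.
X = np.array([[1.0, 0.0], [0.0, 2.0]])
y = np.array([1.0, 2.0])
loss = LeastSquaresLoss().set_data(X, y)

beta = np.zeros(2)
for _ in range(50):
    # Plain gradient descent with step size 1 / L.
    beta = beta - loss.grad(beta, X, y) / loss.lipschitz
```

With this layout, the loss object holds a single scalar after "set_data", so the dataset can be garbage-collected as soon as the caller drops its own references.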