Open Sann5 opened 1 month ago
You are correct that this part of the code is probably the reason why coefficients for at most 5 alphas are returned.
However, this check is also part of the reference implementation in the R glmnet package: https://github.com/cran/glmnet/blob/ace3461223a8fe8a3e0c2f749d3d7e8e84e49ea8/src/glmnet3.f90#L4460
Hence, the code does have its purpose. The main question is where the two implementations deviate from each other. I would suggest to use an example where sksurv terminates early and compare the steps in glmnet and sksurv with a debugger.
Checklist
pytest passespytest (that were passing before) passesWhat does this implement/fix? Explain your changes In
sksurv/linear_model/src/coxnet/coxnet.h
we find thefit
method for theCoxnetSurvivalAnalysis
class. This fits a model for the desired regularization path. The current implementation includes an early stop condition which stops fiting a model for the given alphas based on a deviance ratio. However it is desirable sometimes to fit the whole path (using all the alphas) even if the performance does not change significanly across alphas (see #41 for a more detailed explanation of the behaviour). The obvios solution is to get rid of the early stop condition (which is what I have done here by commentitg it out), however I think the better solution would be to include a boolean flag in the class definition that specifies if the user wishes or not to use this early stopping functionality. Sadly I do not have the time to implement such a solution. Please feel free to use this PR as a starting point.