Closed nanxstats closed 7 years ago
Interesting example. I have several thoughts on this.
[...] lambda.min=0.1 rather than increase max.iter. The data simply does not support reliable estimation for lambdas smaller than this.
[...] max.iter; sure, in this case, it's still fast for max.iter=5000, but for high-dimensional data, this is likely to result in enormous amounts of time being wasted simply to obtain meaningless estimates as in (2) above.

Hi Patrick, thanks for the very concise analysis on this dataset. I totally agree with your point 2 in particular: it is important to do "the right thing" and give users appropriate warnings instead of pretending everything is going well, so keeping max.iter = 1000 sounds reasonable to me.
Point 4 raised a very good point on the warning itself. I could see changing the checking strategy as a possible solution, or maybe just printing more informative messages about the issue, so users could get some clues about what actually happened and dig further into their data.
In some cases, ncvsurv() could be "harder" to converge under the new RMSD criterion. For example:
The 1st and 3rd fits used to converge under the old criterion.
This PR increased the default value of max.iter from 1000 to 5000, which could give a better chance of convergence under the new criterion. Since the iterations here are really fast, this change is unlikely to have a significant performance impact.
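
To make the trade-off concrete, here is a minimal base R sketch of an RMSD-style stopping rule with an iteration cap. It is a toy, not the ncvsurv() implementation: the helper iterate_until_rmsd(), the slow update map, and the eps value are all hypothetical and chosen only so that the loop needs more than 1000 but fewer than 5000 iterations.

```r
# Hypothetical sketch, NOT ncvreg's code: iterate an update function until the
# root-mean-square difference (RMSD) between successive iterates drops below
# eps, or give up with a warning once max.iter iterations have been used.
iterate_until_rmsd <- function(update, start, eps = 1e-4, max.iter = 1000) {
  x <- start
  rmsd <- NA_real_
  for (iter in seq_len(max.iter)) {
    x_new <- update(x)
    rmsd <- sqrt(mean((x_new - x)^2))  # RMSD of the change in this iteration
    x <- x_new
    if (rmsd < eps) {
      return(list(value = x, iter = iter, converged = TRUE))
    }
  }
  # Report how far from the threshold the last step was, so the user has a clue.
  warning(sprintf("No convergence after %d iterations (last RMSD = %.3g, eps = %.3g)",
                  max.iter, rmsd, eps))
  list(value = x, iter = max.iter, converged = FALSE)
}

# Toy update (an assumption for illustration): a very slowly contracting map.
slow <- function(x) 0.999 * x

fit_1000 <- suppressWarnings(iterate_until_rmsd(slow, c(1, 1), max.iter = 1000))
fit_5000 <- iterate_until_rmsd(slow, c(1, 1), max.iter = 5000)

fit_1000$converged  # FALSE: the cap is hit before the RMSD falls below eps
fit_5000$converged  # TRUE: the larger cap lets the same loop finish
```

The larger cap only changes whether the loop is allowed to finish; it says nothing about whether the resulting estimates are meaningful, which is the concern raised in the review comments above.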