Closed: benmarchi closed this issue 3 years ago.
The callback should run fine if the number of iterations is <= 0: https://github.com/mlpack/ensmallen/blob/cdfae670da608ef8498dee1e649f79f1d3279ef9/include/ensmallen_bits/callbacks/report.hpp#L120
Let's see if I can reproduce the issue somehow.
There definitely could be something wrong on my end as well. I'll continue to play around to see if I can nail down the issue.
I was able to create an example that consistently crashes when generating a report. The example calls ensmallen from R, so I don't know if that is part of the issue.
Thank you for the clear report with how to reproduce! In order to reproduce specifically from C++, I adapted the code you wrote a little:
```c++
#include <ensmallen.hpp>

class GLM
{
 public:
  // Construct the object with the given design matrix and responses.
  GLM(const arma::mat& X,
      const arma::colvec& y,
      const arma::colvec& weights,
      const arma::colvec& offset) :
      X(X), y(y), weights(weights), offset(offset) { }

  // Return the objective for model parameters beta, and store the gradient
  // in g.
  double EvaluateWithGradient(const arma::mat& beta, arma::mat& g)
  {
    const arma::colvec eta = X.t() * beta + offset;
    const arma::colvec mu = arma::exp(eta);
    const arma::colvec diff = (y - mu);
    g = -2.0 * X * (weights % diff % arma::exp(eta));
    return arma::accu(weights % arma::square(diff));
  }

 private:
  const arma::mat& X;
  const arma::colvec& y;
  const arma::colvec& weights;
  const arma::colvec& offset;
};

// [[Rcpp::export]]
arma::mat log_glm_lbfgs(const arma::mat& X,
                        const arma::colvec& y,
                        const arma::colvec& weights,
                        const arma::colvec& offset,
                        const int& lineIter)
{
  GLM glm(X, y, weights, offset);
  ens::L_BFGS lbfgs;
  lbfgs.MaxIterations() = 100;
  lbfgs.MaxLineSearchTrials() = lineIter;
  arma::mat beta(X.n_rows, 1, arma::fill::zeros);
  lbfgs.Optimize(glm, beta, ens::Report(0.1));
  return beta;
}

// [[Rcpp::export]]
arma::mat log_glm_lbfgs_noLog(const arma::mat& X,
                              const arma::colvec& y,
                              const arma::colvec& weights,
                              const arma::colvec& offset,
                              const int& lineIter)
{
  GLM glm(X, y, weights, offset);
  ens::L_BFGS lbfgs;
  lbfgs.MaxIterations() = 100;
  lbfgs.MaxLineSearchTrials() = lineIter;
  arma::mat beta(X.n_rows, 1, arma::fill::zeros);
  lbfgs.Optimize(glm, beta);
  return beta;
}

int main()
{
  arma::mat X(10, 1000, arma::fill::randn);
  // In the example, rows 1 through 4 have a binomial distribution, but I don't
  // really feel like implementing that in Armadillo...
  arma::vec beta(10, arma::fill::randn);
  beta *= 0.5;
  arma::vec off = arma::vec(1000, arma::fill::randn);
  // Not sure exactly what this is in the example.
  arma::vec y = arma::vec(1000, arma::fill::randn);
  arma::vec yLog = arma::exp(y);
  arma::vec weight = arma::ones<arma::vec>(1000);

  arma::vec result = log_glm_lbfgs_noLog(X, yLog, weight, off, 10);
  std::cout << "result: " << std::endl << result;
  arma::vec result2 = log_glm_lbfgs(X, yLog, weight, off, 10);
  std::cout << "result2: " << std::endl << result2;
}
```
Now, I actually am not R-literate, so I can't really parse exactly how you made `X` and `y`, but I at least made them have the right shape (although I used a column-major representation, not a row-major one).
But this seems to work just fine, and it prints everything I ask:
```
result:
  -0.1362
   0.0858
  -0.0552
  -0.0052
   0.0553
  -0.0061
  -0.0465
  -0.0130
  -0.0914
  -0.0319

Optimization Report
--------------------------------------------------------------------------------

Initial coordinates:
  0.0000 0.0000 0.0000 ... 0.0000

Final coordinates:
  -0.1362 0.0858 -0.0552 ... -0.0319

  iter       loss   loss change   |gradient|   total time
     0   9247.567         0.000     3166.043        0.000
     1   9132.302       115.265      582.953        0.000
     2   9126.114         6.188      302.031        0.000
     3   9123.418         2.696       98.225        0.001
     4   9123.342         0.076       66.239        0.001
     5   9123.291         0.051        5.527        0.001
     6   9123.290         0.001        2.492        0.001
     7   9123.290         0.000        0.471        0.001
     8   9123.290         0.000        0.145        0.001
     9   9123.290         0.000        0.113        0.001
    10   9123.290         0.000        0.013        0.001
    11   9123.290         0.000        0.006        0.001
    12   9123.290         0.000        0.001        0.001

--------------------------------------------------------------------------------

Version:
ensmallen:                    2.15.1 (Why Can't I Manage To Grow Any Plants?)
armadillo:                    8.400.0 (Entropy Bandit)

Function:
Coordinates rows:             10
Coordinates columns:          1

Loss:
Initial                       9247.567
Final                         9123.290
Change                        124.277

Optimizer:
Iterations:                   13
Coordinates max. norm:        3166.043
Evaluate calls:               18
Gradient calls:               18
Time (in seconds):            0.001

result2:
  -0.1362
   0.0858
  -0.0552
  -0.0052
   0.0553
  -0.0061
  -0.0465
  -0.0130
  -0.0914
  -0.0319
```
So I'm not sure exactly what's going on. I'll see soon if I can make it fail from R too.
I'm glad that you were able to reproduce the issue. The code I shared is a pretty significant simplification of the actual problem I'm trying to solve, but having the `X` matrix contain some, but not all, binary columns is important to the problem and seems to be contributing to the crashing behavior. I generated `y` so that it was strictly positive (such that the objective and gradient are defined for all values in `y`).
Please let me know if there is anything else I can do to help figure out what is going on.
P.S. Why did you switch to column-major? Are ensmallen and Armadillo more efficient when the problem is structured like this?
Sorry, maybe I phrased that badly: I was not able to reproduce the issue.
Can you describe how the `X` matrix is generated? I'm sure I can implement Armadillo code to do it, but the problem is that I can't understand the R that currently does it. :)
I switched to a column-major representation because that's the internal representation that Armadillo uses (which in turn was chosen because BLAS and LAPACK in FORTRAN expect column-major ordering). So, e.g., if you have an algorithm that is iterating over points, it would be best for Armadillo's efficiency if each individual point corresponded to a column. Now... there are some algorithms for which the most efficient representation is with each dimension as a column, but at least from my perspective these seem to be in the minority.
In any case, it depends a lot on the size of the problem whether you'll see any speedup from making that change. If the problem is small, the speedup would likely be small too. (But again, depends on the algorithm.)
I was able to reproduce this from R today. Still trying to narrow down what the cause is. (Also, I can just dump the data from R so no need to rewrite the data creation in C++.)
Hey @benmarchi, I got to the bottom of the issue and opened #252 to fix it. Basically, L-BFGS is not taking any steps at all during optimization. Maybe the gradient is implemented incorrectly? Here's the "corrected" Report output:
```
Optimization Report
--------------------------------------------------------------------------------

Initial coordinates:
  0.0000 0.0000 0.0000 ... 0.0000

Final coordinates:
  0.6530 0.0543 0.0527 ... -0.5212

  iter       loss   loss change   |gradient|   total time
--------------------------------------------------------------------------------

Version:
ensmallen:                    2.15.1 (Why Can't I Manage To Grow Any Plants?)
armadillo:                    9.900.1 (Nocturnal Misbehaviour)

Function:
Coordinates rows:             10
Coordinates columns:          1

Loss:
Initial                       2604569365.849
Final                         2604569365.849
Change                        0.000

Optimizer:
Iterations:                   0 (No steps taken! Did optimization fail?)
Evaluate calls:               11
Gradient calls:               11
Time (in seconds):            0.011
```
Great! Thanks for your help in tracking down the issue. I can definitely check the form of the gradient, but I think the proposed solution is nice in that it provides some indication that something may be wrong with the optimization.
Alright, just merged: https://github.com/mlpack/ensmallen/pull/252.
@zoq / @rcurtin any chance of a new point release with the fix being released? It just missed the 2.16.0 release.
> @zoq / @rcurtin any chance of a new point release with the fix being released? It just missed the 2.16.0 release.
Good idea, I can open a PR.
Let's fix https://github.com/mlpack/ensmallen/issues/253 as well.
I'm having a lot of trouble reproducing #253. Maybe we can merge #258 and do a release, and a fix for #253 can come later. :+1:
> I'm having a lot of trouble reproducing #253. Maybe we can merge #258 and do a release, and a fix for #253 can come later.
Let's see if I can reproduce the issue on one of my systems.
Issue description
I'm using ensmallen via RcppEnsmallen, and I noticed this issue while using the L-BFGS solver. When the optimization quits without taking any steps (likely due to bad initial parameter estimates) and the built-in Report callback is used, the optimization crashes my R session; without the Report callback, it does not. Since the crash occurs while printing the report, I believe something within the callback is causing the issue. My guess is that the callback assumes the number of iterations is greater than 0, which is not the case when the optimizer fails to take a step.
Your environment
Steps to reproduce
Unfortunately, I don't have a reproducible example. My environment is also a bit locked, so I can't readily test any fixes.
Expected behavior
Report to print with full details.
Actual behavior
Example bad report output with R session crash: