yixuan / LBFGSpp

A header-only C++ library for L-BFGS and L-BFGS-B algorithms
https://lbfgspp.statr.me/
MIT License
532 stars 101 forks source link

Uninitialized matrices lead to undefined behaviour #8

Closed Svalorzen closed 4 years ago

Svalorzen commented 4 years ago

While playing with the example code you helped me fix in #7, I started to get randomly erratic behavior while changing lines of code completely unrelated to the optimizer. Sometimes by simply adding or removing a line of code I can consistently reproduce the optimizer going into a completely different direction, and subsequently throwing due to some unfulfilled condition.

Since in my experience these kind of problems are usually reserved for uninitialized values, I got curious and compiled my example in debug mode, and running it through valgrind --leak-check=full --track-origins=yes. Valgrind found many cases of conditional jumps being performed depending on uninitialized variables, which confirmed my suspicions.

The source seems to be the various matrix resizing operation in the reset() functions of the optimizer/matrices(BGFSMat)/etc. I'm not sure if you are aware, but when resizing, Eigen does not automatically zero the matrices. I have fixed the problems by calling, after each resize, setZero(), but I am not sure whether this is OK or if you are explicitly avoiding the initialization because you want to do it in another place in the code to be more efficient.

An example of the change I have made to LBFGSBSolver to fix the problem:

    inline void reset(int n)
    {
        const int m = m_param.m;
        m_bfgs.reset(n, m);
        m_xp.resize(n);
        m_xp.setZero();
        m_grad.resize(n);
        m_grad.setZero();
        m_gradp.resize(n);
        m_gradp.setZero();
        m_drt.resize(n);
        m_drt.setZero();
        if(m_param.past > 0) {
            m_fx.resize(m_param.past);
            m_fx.setZero();
        }
    }

I am attaching the report here in case you need it, as it is fairly long (~5000 lines). Please let me know if I can help you more.

valgrind_report.txt

yixuan commented 4 years ago

Interesting finding. I thought I had already taken into account the uninitialized values and avoided reading those values, but your report seems to indicate that some edge cases exist. I'll look into it later.

yixuan commented 4 years ago

Fixed in https://github.com/yixuan/LBFGSpp/commit/f047ef4586869855f00e72312e7b4d78d11694b1. Only one variable is critical.

Thanks for reporting!