OptimoJoe / Optizelle

Optizelle [op-tuh-zel] is an open source software library designed to solve general purpose nonlinear optimization problems.
www.optimojoe.com/products/optizelle

fixed crash caused by decrementing unsigned variable below 0 #54

Open todorangrg opened 7 years ago

todorangrg commented 7 years ago

Hello,

I have been getting consistent crashes, and after some debugging I realized the problem was an unsigned iterator / index being decremented below 0. I am not really sure whether the (dumb) fix I made is actually correct in the context of what the function should do, but it solved the crash for me.
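For illustration, here is a minimal sketch of this class of bug and one common way to guard against it. It is not the actual Optizelle code or the patch in this pull request; `sum_reverse` and `safe_decrement` are hypothetical names made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from Optizelle: sums a vector back to front.
// A naive loop such as
//     for (std::size_t i = v.size() - 1; i >= 0; --i)
// is broken: `i >= 0` is always true for an unsigned type, so decrementing
// past zero wraps to a huge value and the loop reads out of bounds.
double sum_reverse(const std::vector<double>& v) {
    double total = 0.0;
    // Compare before decrementing, so the index never wraps around.
    for (std::size_t i = v.size(); i > 0; --i) {
        total += v[i - 1];
    }
    return total;
}

// Hypothetical helper: a decrement that saturates at zero instead of wrapping.
std::size_t safe_decrement(std::size_t i) {
    return i > 0 ? i - 1 : 0;
}
```

Whether saturating at zero is the right behavior depends on what the surrounding routine expects, which is exactly the open question in this pull request.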

josyoun commented 7 years ago

Thanks for finding this! Interesting. Alright, so that comes from the GMRES routine used for solving the augmented systems. Basically, when something goes awry, it's possible that the operator that represents the derivative of the equality constraints produces a NaN. It's in this subsection of the code where we detect that.
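As a rough sketch of what such a check can look like, assuming vectors stored as `std::vector<double>`: the function below reports whether the result of applying an operator contains a NaN or an infinity. `has_nonfinite` is a hypothetical name and this is not the detection code in Optizelle's GMRES routine.

```cpp
#include <cmath>
#include <vector>

// Hypothetical check, not Optizelle's actual code: returns true if any entry
// of v is NaN or infinite, e.g. after applying a linear operator or a
// preconditioner inside an iterative solver.
bool has_nonfinite(const std::vector<double>& v) {
    for (double x : v) {
        if (!std::isfinite(x)) {
            return true;
        }
    }
    return false;
}
```

A solver that finds a non-finite value typically stops iterating and falls back to the last good iterate, and any index bookkeeping done on that exit path needs to handle the case where no iteration has completed yet.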

As such, I'm curious about a few things. First, was your linear operator, or preconditioner, throwing a NaN? Second, we allow restarting GMRES to limit the number of Krylov vectors that we need to store. Personally, I never really use this algorithmic feature, so it's probably not super well checked. That said, were you using restarted GMRES? Basically, what's augsys_rst_freq? Finally, do you by any chance know which augmented system solve this occurred in? For example, the quasinormal step, equality multiplier solve, etc. If you have the first few levels of the stack trace, that'd tell us.

Anyway, I thought I'd caught most of these, but evidently not. Beyond general curiosity and the bug fix itself, I'm trying to figure out exactly what happened so that I can add another unit test. We have something like 22 for the CG solver, but only 4 for GMRES, so we need to get some better coverage.

todorangrg commented 7 years ago

Hi,

Thanks for the detailed response. I really do not have time to do any tests / checks these days, sorry (I have to go back through the project's branches to reach the point where the crash was happening; I applied the presented fix internally some time ago and left it like that). I will come back to you by the end of the week / beginning of next week with some answers.

Cheers

josyoun commented 7 years ago

No problem on the timing and thanks for providing the information. I really appreciate it.