wolfe-pack / wolfe

Wolfe Language and Engine
https://wolfe-pack.github.io/wolfe
Apache License 2.0

NPE in spec #155

Closed by riedelcastro 9 years ago

riedelcastro commented 9 years ago

https://github.com/wolfe-pack/wolfe/blob/dev/wolfe-core/src/test/scala/ml/wolfe/term/TermSpecs.scala#L288-288

rockt commented 9 years ago

I think the reason this spec fails is that differentiate needs to be called on something like `d.differentiate(Theta(IndexedSeq(vector(1), vector(2)), IndexedSeq(vector(3), vector(4))))`; otherwise choice would pick a second relation vector that is not there. I'm now checking whether that was also the problem in my code.
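
For concreteness, here is a self-contained toy model of that suspicion (`Theta`, `vector`, and `choice` are made-up stand-ins for the spec's definitions, not wolfe's actual API): if only one relation vector is supplied, a choice over the second relation index has nothing to pick.

```scala
// Toy model only; Theta, vector and choice are illustrative stand-ins.
object ChoiceDemo extends App {
  type Vect = IndexedSeq[Double]
  def vector(xs: Double*): Vect = xs.toIndexedSeq

  case class Theta(relations: IndexedSeq[Vect], entities: IndexedSeq[Vect])

  // choice modelled as a plain lookup of the i-th relation vector
  def choice(theta: Theta, i: Int): Option[Vect] = theta.relations.lift(i)

  val twoRelations = Theta(IndexedSeq(vector(1), vector(2)),
                           IndexedSeq(vector(3), vector(4)))
  val oneRelation  = Theta(IndexedSeq(vector(1)),
                           IndexedSeq(vector(3), vector(4)))

  println(choice(twoRelations, 1)) // Some(Vector(2.0)): both vectors present
  println(choice(oneRelation, 1))  // None: the second relation vector "is not there"
}
```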

riedelcastro commented 9 years ago

Hmm, would surprise me, but who knows. I really felt there was an issue with the "change" memory of settings when used in several contexts.

rockt commented 9 years ago

New spec here.

riedelcastro commented 9 years ago

I simplified the spec a bit. It was easier once I replaced sampleShuffled with sampleSequential. My impression now is that the problem is an empty sequence at the beginning (that's how the data happened to be sampled by sampleShuffled earlier).


riedelcastro commented 9 years ago

I added an even simpler spec (one that also doesn't depend on the first-order sum, or on sequences at all).

https://github.com/wolfe-pack/wolfe/blob/dev/wolfe-core/src/test/scala/ml/wolfe/term/TermSpecs.scala#L310-310

The core problem is that variables sometimes change while there are terms that depend on them but haven't been evaluated since the last change. In the above example this is the case for the second "t.x dot t.y" term, and in particular for its "t.x" (and "t.y") sub-terms. In the first iteration the first t.x term gets evaluated and its output becomes non-null. However, the second t.x term isn't called, so its output is not updated and remains null. In the second iteration we pretend that the input hasn't changed, and hence no update of the second t.x term is triggered. It remains null.
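
A minimal standalone sketch of this failure mode, assuming (purely for illustration, with made-up names rather than wolfe's actual Term/Setting API) that the change flag lives in the shared setting, so the first term to evaluate consumes it for everyone:

```scala
// Standalone model of the staleness bug: one shared "changed" flag.
object StaleTermDemo extends App {
  class Setting(var value: Double) {
    var changed = false
    def update(v: Double): Unit = { value = v; changed = true }
  }

  class SquareTerm(input: Setting) {
    var output: java.lang.Double = null // null until first evaluated
    def eval(): Unit = {
      if (input.changed) output = input.value * input.value
      input.changed = false // bug: resets the flag for all other terms too
    }
  }

  val x  = new Setting(0.0)
  val t1 = new SquareTerm(x)
  val t2 = new SquareTerm(x) // a second term over the same variable

  x.update(1.0)
  t1.eval()          // sees the change, output becomes 1.0, flag is cleared
  t2.eval()          // "input hasn't changed", so output is never written
  println(t1.output) // 1.0
  println(t2.output) // null: the NPE waits for whoever dereferences this
}
```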


riedelcastro commented 9 years ago

I think the solution to this problem is to change the "record change" infrastructure. Instead of storing change records in the settings, the setting(s) should have change listeners which can record changes. Then each term can listen for changes in its input settings, and terms that don't get evaluated for a while record all changes in the meantime.
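
A hedged sketch of that listener idea (all names are illustrative, not the actual wolfe classes): the setting notifies registered listeners on every update, and each term keeps its own record of pending changes instead of sharing a single flag.

```scala
// Illustrative sketch: per-term change records via listeners.
object ListenerDemo extends App {
  class Setting(var value: Double) {
    private var listeners = List.empty[() => Unit]
    def onChange(listener: () => Unit): Unit = listeners ::= listener
    def update(v: Double): Unit = { value = v; listeners.foreach(_()) }
  }

  class SquareTerm(input: Setting) {
    private var pendingChanges = 0   // this term's own change record
    input.onChange(() => pendingChanges += 1)
    var output: java.lang.Double = null
    def eval(): Unit =
      if (pendingChanges > 0) {
        output = input.value * input.value
        pendingChanges = 0           // consumes only this term's record
      }
  }

  val x  = new Setting(0.0)
  val t1 = new SquareTerm(x)
  val t2 = new SquareTerm(x)

  x.update(1.0)
  t1.eval()          // clears only t1's record
  t2.eval()          // still sees its own pending change
  println(t2.output) // 1.0: no stale null this time
}
```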

riedelcastro commented 9 years ago

I fixed this for now. Terms now keep their own record of how their input has changed, and if a term isn't evaluated for a long while, its input change record just becomes longer. I think this may have broken other things (the specs pass, though), and initially we may see a slowdown. But this was an important structural fix that would otherwise have caused us a lot of other trouble (some of which wouldn't even surface as NPEs, but as wrong results).
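
A sketch of the trade-off this implies (again with made-up names, not the real implementation): a term that isn't evaluated while its input keeps changing accumulates an ever longer record, which it only works off when finally evaluated. That would also be consistent with the slowdown reported below.

```scala
// Illustrative sketch of the growing per-term change record.
object GrowingRecordDemo extends App {
  class Setting(var value: Double) {
    private var listeners = List.empty[Double => Unit]
    def onChange(listener: Double => Unit): Unit = listeners ::= listener
    def update(v: Double): Unit = { value = v; listeners.foreach(_(v)) }
  }

  class Term(input: Setting) {
    var record = List.empty[Double]   // grows until eval is called
    input.onChange(v => record ::= v)
    def eval(): Unit = record = Nil   // processes the whole backlog at once
  }

  val x = new Setting(0.0)
  val t = new Term(x)
  for (i <- 1 to 10000) x.update(i)   // t is never evaluated meanwhile
  println(t.record.size)              // 10000: the record "just becomes longer"
  t.eval()                            // correct, but extra work per neglected term
}
```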

If you find problems with this (optimization not working, exceptions etc.) please add corresponding specs.

rockt commented 9 years ago

Great, I don't get the NPE or any other exceptions in my code anymore! However, the optimization doesn't seem to be working: the objective value is going down (without converging) instead of up (and converging). I think in my code this is due to the mf terms and lrl terms not being optimized jointly; I'll try to write a spec for this. The second thing I noticed is that the optimization time went up dramatically: Model F doesn't finish the first epoch within 10 minutes.