xzackli / Bolt.jl

differentiable Boltzmann code
MIT License

early nn stuff #89

Open jmsull opened 1 year ago

jmsull commented 1 year ago

Round 1 of very simple Adam opt plots on CDM at fixed background:

delta and v:

[plots: deltac_learning_v1_multnoise0.1_Adam80_1.0, vc_learning_v1_multnoise0.1_Adam80_1.0]

reconstructed delta', v':

[plots: deltacprime_learning_v1_multnoise0.1_Adam80_1.0, vcprime_learning_v1_multnoise0.1_Adam80_1.0]
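For orientation, here is a minimal sketch of the kind of training loop behind these plots, assuming the Flux.jl explicit-parameter API. The network shape and the data are hypothetical stand-ins; only the multiplicative noise level (0.1) and the Adam settings (80 iters at $\eta = 1.0$) are read off the plot filenames:

```julia
using Flux

# Hypothetical stand-in: a small MLP mapping (δc, vc) -> (δc', vc'),
# i.e. learning the RHS of the CDM perturbation equations at fixed background.
model = Chain(Dense(2 => 32, tanh), Dense(32 => 2))

# Fake training data with multiplicative noise at the 0.1 level,
# mimicking the "multnoise0.1" tag in the plot filenames.
X = randn(Float32, 2, 256)      # (δc, vc) samples
Ytrue = randn(Float32, 2, 256)  # placeholder (δc', vc') targets
Y = Ytrue .* (1 .+ 0.1f0 .* randn(Float32, size(Ytrue)))

# 80 iterations of Adam at η = 1.0, as in "Adam80_1.0".
opt_state = Flux.setup(Adam(1.0), model)
for iter in 1:80
    loss, grads = Flux.withgradient(m -> Flux.mse(m(X), Y), model)
    Flux.update!(opt_state, model, grads[1])
end
```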

jmsull commented 1 year ago

v and v' look pretty bad - lots of room to improve

Sorry, the title label on the second-to-last plot is wrong; it should say delta'.

jmsull commented 1 year ago

BTW this is a really long mode, $k \sim 0.003$.

jmsull commented 1 year ago

Now training with 50 iters of Adam at $\eta=1$, followed by 50 at $\eta=0.1$, 20 at $\eta=0.01$, and then 10 iters of BFGS (default hyperparameters). It looks a little better, especially in the solutions, maybe not so much in the reconstructions of $u'$.

[plots: deltac_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs, vc_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]

Reconstruction:

[plots: deltacprime_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs, vcprime_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]
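A sketch of that staged schedule, assuming the SciML Optimization.jl interface; the placeholder objective stands in for the real solve-the-ODE-and-compare loss:

```julia
using Optimization, OptimizationOptimisers, OptimizationOptimJL

# Hypothetical loss over parameters p; in practice this would solve the
# perturbation ODE with the NN RHS and compare to the noisy data.
loss(p, _) = sum(abs2, p)

optf = OptimizationFunction(loss, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, rand(10))

# Staged schedule from the comment: 50 @ η=1.0, 50 @ η=0.1, 20 @ η=0.01, then BFGS.
res = solve(prob, OptimizationOptimisers.Adam(1.0); maxiters = 50)
res = solve(remake(prob; u0 = res.u), OptimizationOptimisers.Adam(0.1); maxiters = 50)
res = solve(remake(prob; u0 = res.u), OptimizationOptimisers.Adam(0.01); maxiters = 20)
res = solve(remake(prob; u0 = res.u), OptimizationOptimJL.BFGS(); maxiters = 10)
```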

jmsull commented 1 year ago

The loss curve:

[plot: loss_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]

It looks like maybe BFGS is starting to just turn down? But the BFGS iters are super expensive (I suppose due to the Hessian approximation, even with forward diff, which I assume it is using for that). We should perhaps run this on something with more oomph than my laptop...
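For what it's worth, BFGS never computes a Hessian directly; it builds an inverse-Hessian approximation from successive gradient differences, so the per-iteration expense is mostly the line search, which can trigger several extra function and gradient evaluations. If we call Optim.jl directly, forward-mode gradients can be requested explicitly; a toy sketch (the objective is a stand-in):

```julia
using Optim

f(x) = sum(abs2, x .- 1)   # toy objective standing in for the real loss
x0 = zeros(10)

# BFGS updates its inverse-Hessian approximation from gradient differences;
# `autodiff = :forward` tells Optim to compute those gradients with ForwardDiff.
res = optimize(f, x0, BFGS(), Optim.Options(iterations = 10); autodiff = :forward)
```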

jmsull commented 1 year ago

Some other observations:

jmsull commented 1 year ago

Another thing I'm eager to try is adding more data and batching over $k$, which will be closer to what we want to do eventually...
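A hedged sketch of what batching over $k$ could look like with Flux's DataLoader; `loss_at_k` is a hypothetical placeholder for solving the hierarchy at a given $k$ and comparing to data:

```julia
using Flux

ks = exp10.(range(-3, -1, length = 64))   # hypothetical grid of k modes

# Placeholder per-mode loss; in practice this would solve the perturbation
# equations at wavenumber k with the current NN and compare to the data.
loss_at_k(model, k) = sum(abs2, model([Float32(k)]))

model = Chain(Dense(1 => 16, tanh), Dense(16 => 1))
opt_state = Flux.setup(Adam(1e-2), model)

for epoch in 1:10
    for kbatch in Flux.DataLoader(ks; batchsize = 8, shuffle = true)
        _, grads = Flux.withgradient(m -> sum(k -> loss_at_k(m, k), kbatch), model)
        Flux.update!(opt_state, model, grads[1])
    end
end
```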