DesmondYuan opened this issue 3 years ago
Curated models are generally harder to train, probably because their network degree differs from that of the synthetic models. Adding more conditions or decreasing `alpha` to `0.2` helps.
Can be trained with:

- `n_mu=5` and `n_mu=1`
- `n_mu=2` at least
- `n_mu=19` and `n_mu=2` at least (`weight_decay`: `1e-12` -> `1e-8` -> `1e-6` -> `1e-4`)

All with train/valid/test = 100/10/10 and `ntotal=5` (`ts=0:4:20`).
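For concreteness, here is a hedged sketch of those settings written out as a Julia config. The field names (`n_exp_train`, `n_exp_valid`, `n_exp_test`, etc.) are assumptions for illustration and may not match the actual CellBox.jl config keys.

```julia
# Hypothetical sketch of the settings listed above -- the field names are
# illustrative guesses, not necessarily the actual CellBox.jl config keys.
config = (
    alpha        = 0.2,        # decreased from 1.0 to help the curated models
    n_mu         = 2,          # one of the working values listed above
    weight_decay = 1e-6,       # see the 1e-12 -> 1e-8 -> 1e-6 -> 1e-4 schedule
    n_exp_train  = 100,        # train/valid/test = 100/10/10
    n_exp_valid  = 10,
    n_exp_test   = 10,
    ntotal       = 5,          # as listed above
    ts           = 0:4:20,     # time points
)
```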
Adding a schedule for increasing `weight_decay` helps too.
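A minimal sketch of such an increasing `weight_decay` schedule, assuming a generic `train!(model, data; ...)` entry point (that function name is a placeholder, not the actual CellBox.jl API):

```julia
# Staged weight-decay schedule: start with almost no regularization and
# tighten it over successive training stages (stage lengths are illustrative).
weight_decay_schedule = [1e-12, 1e-8, 1e-6, 1e-4]
epochs_per_stage = 500

for wd in weight_decay_schedule
    # `train!` is a stand-in for whatever training loop is actually used;
    # only the weight_decay value changes between stages.
    train!(model, data; weight_decay = wd, epochs = epochs_per_stage)
end
```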
"Can be trained" here means MAE <= 1e-3 and parameters strongly correlated with the ground truth.
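That criterion could be checked with something like the sketch below; the variable names (`ŷ`, `y`, `w_learned`, `w_true`) and the 0.9 correlation threshold are placeholders, not values from the issue.

```julia
using Statistics

# "Can be trained" check as defined above:
#   (1) mean absolute prediction error <= 1e-3
#   (2) learned interaction parameters strongly correlated with ground truth
mae = mean(abs.(ŷ .- y))
ρ   = cor(vec(w_learned), vec(w_true))   # Pearson correlation

can_be_trained = (mae <= 1e-3) && (ρ >= 0.9)   # 0.9 is an assumed threshold
```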
Synthetic CY trained with the config and new network file in commit https://github.com/jiweiqi/CellBox.jl/commit/95830d2dd0a5bbd5526c095becc58bf46a02ca41. `alpha` is set to `1.0`, with `n_mu=2` and `n_exp_train=20`. There is still a nice tendency to oscillate :)
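Relative to the earlier config sketch, the synthetic CY run would then only differ in a few fields (again, the field names are assumptions):

```julia
# Overrides for the synthetic CY run, applied on top of the earlier sketch.
cy_config = merge(config, (alpha = 1.0, n_mu = 2, n_exp_train = 20))
```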
Bifurcating configs are in commit 650b9cf698c34e18ebc0206940b37525b3ec8379. `alpha` is set to `0.2`. For optimal parameter learning, `ntotal` and `n_exp_train` need to be slightly larger than in the given configs.
Env created for testing those models in commit 881527314fe1b49eef92b50ccd6a9869d48468ef.
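Assuming that commit checks in a `Project.toml`/`Manifest.toml`, reproducing the env is the standard Pkg workflow (generic Julia, nothing CellBox.jl-specific):

```julia
using Pkg

# Activate and instantiate the test environment from the checked-out commit
# (assumes Project.toml / Manifest.toml sit in the repository root).
Pkg.activate(".")
Pkg.instantiate()
```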
I suggest we split the tasks into 3. A quick look suggests that cyclic and bifurcation networks are relatively harder to train.