Closed qichenglao closed 5 years ago
@YingzhenLi, can you have a look?
Hi qichenglao,
You are right, it should be 'gen_%d_head', however:
In practice I would suggest a Bayesian treatment for the shared network only. In our experiments, we used VCL to train the shared network and MLE (so no KL penalty) to train the private head network for each task. You can also add the KL penalty when training the head network (with q_{t-1} for the private network being a standard Gaussian), but I would expect that to work worse.
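To make the suggestion above concrete, here is a minimal sketch in plain Python, not the repository's actual code. It assumes mean-field Gaussian posteriors stored as (mean, log-std) lists, and the names kl_diag_gaussian, kl_penalty, include_head and the 'gen_shared' prefix are hypothetical; only the 'gen_%d_head' pattern comes from this thread. Shared variables are penalized against the previous posterior q_{t-1}, and head variables are either skipped (MLE) or penalized against a standard Gaussian.

```python
import math

def kl_diag_gaussian(mu_q, log_sig_q, mu_p, log_sig_p):
    """KL(q || p) between two diagonal Gaussians given as lists of means and log-stds."""
    total = 0.0
    for mq, lsq, mp, lsp in zip(mu_q, log_sig_q, mu_p, log_sig_p):
        var_q = math.exp(2.0 * lsq)
        var_p = math.exp(2.0 * lsp)
        total += lsp - lsq + (var_q + (mq - mp) ** 2) / (2.0 * var_p) - 0.5
    return total

def kl_penalty(q_params, prev_params, task, include_head=False):
    """Sum the KL terms for one task.

    q_params / prev_params: dict mapping variable name -> (mean list, log-std list).
    Variable naming ('gen_shared', 'gen_%d_head') is an assumption for this sketch.
    """
    head_tag = 'gen_%d_head' % task
    total = 0.0
    for name, (mu, log_sig) in q_params.items():
        if 'gen_shared' in name:
            # Shared network: previous posterior q_{t-1} acts as the prior.
            mu_p, log_sig_p = prev_params[name]
        elif head_tag in name and include_head:
            # Private head: optional KL against a standard Gaussian N(0, 1).
            mu_p = [0.0] * len(mu)
            log_sig_p = [0.0] * len(mu)
        else:
            # Head trained by MLE, or a variable belonging to another task.
            continue
        total += kl_diag_gaussian(mu, log_sig, mu_p, log_sig_p)
    return total

# Toy example: the shared posterior equals the previous one, the head has mean 1.
q = {
    'gen_shared/W': ([0.3, -0.2], [0.0, 0.0]),
    'gen_1_head/W': ([1.0], [0.0]),
}
prev = {'gen_shared/W': ([0.3, -0.2], [0.0, 0.0])}

kl_shared_only = kl_penalty(q, prev, task=1)
kl_with_head = kl_penalty(q, prev, task=1, include_head=True)
```

With include_head=False the head contributes nothing to the penalty, which corresponds to the MLE treatment described above.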
Yeah, that ends up having the same effect as not computing the second part in the KL_param function. Thanks a lot for your comments!
Thanks, @YingzhenLi and @qichenglao. Since the bug has been fixed, I will close the issue.
Hi, there is a typo here: https://github.com/nvcuong/variational-continual-learning/blob/12e1883abc0309e7d7dc2e68bfd3590df3557de0/dgm/alg/onlinevi.py#L16. Shouldn't it be if 'gen_%d_head' % task in var.name?
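For illustration, a small self-contained sketch of how the proposed check selects head variables by name. The Var stand-in, the helper head_variables, and the example variable names are hypothetical; only the 'gen_%d_head' % task pattern is taken from the line in question.

```python
from collections import namedtuple

# Hypothetical stand-in for a framework variable; only the .name attribute is used.
Var = namedtuple('Var', ['name'])

def head_variables(variables, task):
    """Return the variables whose name contains the head tag for the given task."""
    tag = 'gen_%d_head' % task  # e.g. 'gen_1_head' for task 1
    return [var for var in variables if tag in var.name]

variables = [
    Var('gen_shared/W:0'),   # assumed name for a shared-network variable
    Var('gen_0_head/W:0'),
    Var('gen_1_head/W:0'),
    Var('gen_1_head/b:0'),
]

selected = [var.name for var in head_variables(variables, task=1)]
```

Here selected contains only the two 'gen_1_head' entries, so a misspelled tag would match nothing and silently skip the head variables.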