I believe the ELBO is maximized using SGD in KLQP.py.
First: By default, only one data point is used to compute the estimator of the full gradient. Does the n_samples argument set the number of gradients computed to estimate this noisy gradient?
Second: Are you considering using SAG (see Le Roux et al., 2013) or SAGA (see Defazio et al., 2014) updates instead of SGD?
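
To make the second question concrete, here is a minimal sketch of the kind of update I mean, written for a toy least-squares problem rather than the ELBO. It is not taken from KLQP.py: the function name `saga_least_squares` and all of its parameters are hypothetical, chosen only for illustration. The difference from plain SGD is that SAGA keeps a table with the last gradient seen for each data point and uses it to reduce the variance of the step.

```python
import numpy as np


def saga_least_squares(X, y, step_size, n_iters, seed=0):
    """SAGA on f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Most recent gradient stored for each data point (O(n * d) memory).
    grad_table = np.zeros((n, d))
    # Running mean of the stored gradients.
    grad_mean = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)                 # pick one data point, as in SGD
        g_new = (X[i] @ w - y[i]) * X[i]    # fresh gradient of f_i at w
        # Plain SGD would step along g_new alone; SAGA corrects it with
        # the stored gradient for point i and the mean of the table.
        w = w - step_size * (g_new - grad_table[i] + grad_mean)
        # Update the running mean and the table entry for point i.
        grad_mean = grad_mean + (g_new - grad_table[i]) / n
        grad_table[i] = g_new
    return w
```

With a step size of 1 / (3 L), where L is the largest squared row norm of X, this should recover the true weights on noiseless data. For the ELBO, the per-data-point gradient is itself a Monte Carlo estimate, so I am not sure how well the stored gradients would carry over.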