During Training
We need to monitor the progress and stability of training at the end of each epoch. We can consider both quantitative and qualitative evaluations.
Quantitative metric
[ ] ~KL divergence~
~(Memory caution?) Need to store MCMC samples for each 'epoch'~
~Can only be used for simulation (we do not know the true latent distribution for real data)~
KL is meaningless and hard to evaluate in ALMOND
[ ] Reconstruction error
Efficiency (train ALMOND after partially training the VAE: train the VAE for 50 epochs, then compare VAE and ALMOND; is ALMOND more efficient than the VAE?)
Accuracy (train ALMOND after fully training the VAE: train the VAE for more than 200 epochs, then train ALMOND; does ALMOND overcome the limitation of the ELBO?)
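The efficiency question above could be checked by logging the reconstruction error after every epoch and comparing how many epochs each model needs to reach the same error level. The following is a minimal sketch under stated assumptions: the helper names `mse` and `epochs_to_reach` are hypothetical, the data are plain nested lists rather than tensors, and the target error is something we would still have to choose.

```python
from typing import Optional, Sequence


def mse(x: Sequence[Sequence[float]], x_hat: Sequence[Sequence[float]]) -> float:
    """Mean squared reconstruction error over all entries of a batch."""
    total, count = 0.0, 0
    for row, row_hat in zip(x, x_hat):
        for value, value_hat in zip(row, row_hat):
            total += (value - value_hat) ** 2
            count += 1
    if count == 0:
        raise ValueError("cannot compute MSE of an empty batch")
    return total / count


def epochs_to_reach(history: Sequence[float], target: float) -> Optional[int]:
    """Return the first 1-based epoch whose logged error is <= target.

    `history` is the list of per-epoch reconstruction errors for one model.
    Returns None if the target is never reached.
    """
    for epoch, error in enumerate(history, start=1):
        if error <= target:
            return epoch
    return None
```

Calling `epochs_to_reach` on the VAE history and on the ALMOND history with the same `target` gives two epoch counts that can be compared directly. For a fair comparison, both histories should be computed on the same held-out data.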
After Training
We need to compare the results across the methods used to infer the latent distribution (variational inference, sampling, and message passing). Although it is well known that sampling and message passing algorithms infer the true latent distribution better than variational inference, the effect of the inference method on the final result has not been explicitly studied so far. We can consider both quantitative and qualitative evaluations.
Quantitative metric
[ ] Reconstruction error
MSE for single-cell data
FID for image data
Imputation error for recommender systems
[ ] Clustering result
Choose a standard clustering method and apply it to the latent variables
Calculate the Adjusted Rand Index and Adjusted Mutual Information against the true labels (only when labels are available)
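As a reference for the clustering metric, here is a minimal pure-Python sketch of the Adjusted Rand Index computed from the pair-counting contingency table. The function name `adjusted_rand_index` is our own. In practice scikit-learn's `adjusted_rand_score` and `adjusted_mutual_info_score` would likely be used instead. The sketch takes the cluster assignments as input, so it does not depend on which clustering method we end up choosing.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(
    labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]
) -> float:
    """Adjusted Rand Index between a true labeling and a predicted clustering."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: partitions agree trivially

    # Count pairs of samples that share a cell, a true class, or a cluster.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / total_pairs  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)
```

The score is invariant to how cluster IDs are named, so `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` counts as perfect agreement. Values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.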
Qualitative metric