Photrek / Coupled-VAE

Machine learning algorithms for complex systems leveraging the methods of nonlinear statistical coupling

Develop Coupled ELBO Probability Graphic #12

Open kenricnelson opened 3 years ago

kenricnelson commented 3 years ago

There is a need to expand the analysis shown in Figs. 5 and 8 of the Coupled VAE paper. That paper presented the reconstruction loss as a histogram of the loss per image, overlaid with the Accuracy, Robustness, and Decisiveness generalized mean metrics. The analysis needs to add a plot for the latent layer divergence and to combine the divergence and reconstruction into the probabilities that represent the ELBO metric.

The design for this analysis is being developed in the Mathematica file Generalized_ELBO.nb. There are three components:

  1. Probability histogram of reconstruction with overlay of generalized mean metrics
  2. Probability histogram of divergence with overlay of generalized mean metrics
  3. Probability histogram of ELBO with overlay of generalized mean metrics

kenricnelson commented 3 years ago

I am currently working on determining whether the divergence and ELBO histograms can be formed in a consistent manner. The reconstruction histogram was straightforward because each generated image is measured by a likelihood, which gives the probabilities for the histogram. I'm seeking to formulate the equivalent for the divergence and the ELBO.

kenricnelson commented 3 years ago

Currently assigning this task to myself and Thistleton. Once the design is worked out and tested, we can decide who to assign the Python integration to.

kenricnelson commented 3 years ago

To achieve the Coupled ELBO Probability Graphic we will need to take a simplified approach relative to the current design of the coupled ELBO. The current ELBO adds the three components of the loss function. The probabilistic elements of these three components are the posterior density, the prior density, and the reconstruction likelihood. If the mathematics of the current coupled ELBO design were followed precisely, the graphical representation would require that the probabilities be combined using the coupled product; however, doing so would complicate the formation of one histogram over which the three metrics (Decisiveness, Accuracy, and Robustness) are overlaid.

To simplify the graphical representation, we will instead use the standard product (coupling = 0) for combining the probabilistic components. The generalized mean can then be computed for the two components (probability of divergence and probability of reconstruction), and the combination will then be the product of these two.

The updates to the nsc library needed to complete this should be minimal, since we already have the generalized mean implemented and will not need the coupled product. The probability divergence histogram may be a closed-form computation since it is based on the standard KL-divergence. The probability likelihood is available as a variable and is the graphic already produced in the Shichen, Li, et al. paper.

Given this design, I will assign this to Kevin Chen for implementation and he can work with Hong and John to decide who will do the initial coding. I will work closely with the team to clarify the specifics.

Kevin-Chen0 commented 3 years ago

I am tasking @hxyue1 with the implementation of the Coupled ELBO, as this is related to his work on integrating NSC into the loss function of the VAE.

I will be working on the generalized mean graphics part for the VAE in the meantime.

kenricnelson commented 3 years ago

@Kevin-Chen0 @hxyue1 Here are the steps I recommend, as "Coupled ELBO Probability Graphic" is the application of the generalized mean to the probabilities representing the Coupled ELBO.

1) Reproduce the assessment of the likelihoods as presented in Cao's paper on the Coupled VAE. This is the easiest step, as the probability likelihoods for each reconstructed image are readily available. Overlaying the histogram are the three generalized means: Robustness (-2/3 mean), Accuracy (0 mean), and Decisiveness (1 mean). We may also want to display a metric for the power associated with the coupling parameter used for the training. This would be r = -2 kappa/(1 + d kappa).

The plotting of the histograms may be the most difficult part of this. Jingjing used R to make the plots; however, I think we should stick with Python. @Thistleton, if necessary, can you work with Kevin on the plotting of the data?

2) Produce a histogram of the divergence for the simplest case: KL-divergence for Gaussian prior and posterior. This is a closed-form function provided by the VAE code. The probability is simply KL_div_prob = exp(-KL_divergence). Again, a probability is computed for each input datapoint, the results form a histogram, and the three metrics overlay the histogram.

3) Produce a histogram of the negELBO_probability. This is formed by multiplication of the first two: likelihood * KL_div_prob.

4) The next case to try is the coupled_divergence given Gaussian prior and posterior distributions. However, because we are going to keep the probabilities independent (i.e., they multiply), this case is actually computed identically to steps 1-3. That is, we are going to use the standard (not coupled) KL_div_prob, and this will be multiplied by the likelihood to form the negELBO_prob. So the analysis is the same, but the results will vary as the coupling_entropy value is changed.

5) If we are able to complete experiments with the coupled distributions for the latent layer, then a new computation is required. Here we still have KL_div_prob = exp(-KL_divergence), but now the KL_divergence is between a prior and posterior of Coupled Gaussians. I examined these integrals last week; my first inspection suggests that they do not simplify to a closed form, so numerical integration will be necessary for this step. This is not really a problem, as the numerical computation is also necessary for the coupled_divergence of these integrals, so these experiments depend on completion of this capability anyway.
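A minimal sketch of the numerical route mentioned in step 5, assuming one-dimensional densities evaluated on a fixed grid and a plain trapezoidal rule. The helper names (`gaussian_pdf`, `numerical_kl`) are hypothetical and not part of the nsc library. Ordinary Gaussians stand in for the Coupled Gaussians here so that the result can be checked against the known closed form; a Coupled Gaussian density would be passed in the same way.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian; a stand-in for a Coupled Gaussian pdf."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def numerical_kl(q_pdf, p_pdf, grid):
    """Approximate KL(q || p) = integral of q * log(q / p) with the trapezoidal rule."""
    q = q_pdf(grid)
    p = p_pdf(grid)
    integrand = np.zeros_like(q)
    mask = q > 0  # q * log(q / p) is taken as 0 wherever q is 0
    integrand[mask] = q[mask] * (np.log(q[mask]) - np.log(p[mask]))
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)))


# Illustrative posterior N(0.5, 0.8^2) against a standard normal prior.
mu_q, sigma_q = 0.5, 0.8
grid = np.linspace(-12.0, 12.0, 20001)
kl_numeric = numerical_kl(
    lambda x: gaussian_pdf(x, mu_q, sigma_q),
    lambda x: gaussian_pdf(x, 0.0, 1.0),
    grid,
)

# Closed form for this Gaussian case, used only as a sanity check.
kl_closed = 0.5 * (sigma_q ** 2 + mu_q ** 2 - 1.0 - np.log(sigma_q ** 2))

# Translation of the divergence into a probability, as in the steps above.
kl_div_prob = np.exp(-kl_numeric)
```

The grid limits and resolution are illustrative choices; heavier-tailed densities would likely need a wider grid or an adaptive integrator.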

Let me know if there are any questions about these steps. This design of the assessment doesn't actually have much dependency on the work Hong and John are doing on the NSC library. I think the biggest issues are utilizing the Python plotting tools to make the histograms and coding the divergence and its translation into a probability.
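Steps 2 and 3 above can be sketched as follows. This assumes a diagonal Gaussian posterior parameterized by `mu` and `log_var` and a standard normal prior, which is the usual VAE setup but is not confirmed by this thread. The function names are hypothetical. The sketch works with log-probabilities, so the product likelihood * KL_div_prob becomes a sum; this is a choice made here to avoid underflow for very small likelihoods, not something specified in the design.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per datapoint."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def neg_elbo_log_probs(log_likelihood, mu, log_var):
    """Return (log KL_div_prob, log negELBO_prob) for each datapoint."""
    # KL_div_prob = exp(-KL_divergence), so its log is simply -KL_divergence.
    log_kl_div_prob = -gaussian_kl_to_standard_normal(mu, log_var)
    # negELBO_prob = likelihood * KL_div_prob, i.e. a sum in the log domain.
    log_neg_elbo_prob = np.asarray(log_likelihood, dtype=float) + log_kl_div_prob
    return log_kl_div_prob, log_neg_elbo_prob


# Toy inputs: two datapoints with a 2-D latent space (values are illustrative).
mu = np.array([[0.0, 0.0], [1.0, -1.0]])
log_var = np.zeros((2, 2))
log_likelihood = np.array([-100.0, -150.0])

log_kl_div_prob, log_neg_elbo_prob = neg_elbo_log_probs(log_likelihood, mu, log_var)
```

Each of the three arrays (log-likelihood, log KL_div_prob, log negELBO_prob) can then be histogrammed and overlaid with the three generalized mean metrics.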

Kevin-Chen0 commented 3 years ago

I have the following questions:

  1. Just to confirm: the implementation steps you mentioned for the Coupled ELBO Probability Graphic are separate from the Coupled ELBO formula below, right, as the former is supposed to represent the latter? [image: coupled_elbo]
  2. Regarding step 1, reproducing the assessment of the likelihoods as presented in Cao's latest paper: is there anything done there that we haven't already done in last year's abstract? I am planning to reuse the VAE histogram code from last year (shown below) to create this histogram, except that I will use the generalized_mean code that @jkclem created in the nsc lib instead of _calculate_generalized_mean.
    import math

    import matplotlib.pyplot as plt

    # The methods below belong to a class whose definition is not shown.
    def _display_histogram(self, overall_values):
        # Also sets self.log_decisiveness, self.log_accuracy and self.log_robustness.
        log_probability_values = self._calculate_generalized_mean(overall_values)
        # The histogram of the data
        fig, ax = plt.subplots(figsize=(24, 10))
        # Ticks sit at the natural log of each labelled probability.
        xtick_labels = ['1e-240', '1e-210', '1e-180', '1e-150', '1e-120', '1e-90', '1e-60', '1e-30', '1']
        ax.set_xticks([math.log(float(label)) for label in xtick_labels])
        ax.set_xticklabels(xtick_labels)
        plt.title('Histogram for Coupled VAE', fontdict={'fontsize': 40, 'weight': 'bold'})
        plt.xlabel('Probability of reconstructed image equivalent to original one.', fontdict={'fontsize': 40, 'weight': 'bold'})
        plt.ylabel('Frequency in logscale.', fontdict={'fontsize': 40, 'weight': 'bold'})
        # Overlay the three generalized mean metrics as vertical lines.
        plt.axvline(self.log_decisiveness, color='r', linestyle='dashed', linewidth=2)
        plt.axvline(self.log_accuracy, color='b', linestyle='dashed', linewidth=2)
        plt.axvline(self.log_robustness, color='g', linestyle='dashed', linewidth=2)
        plt.hist(log_probability_values, log=True, bins=100, facecolor='white', edgecolor='black')
        plt.savefig(f'{self.display_path}/histograms/hist_cd{self.coupling_dist}_cl{self.coupling_loss}.png')
        plt.show()

    def _calculate_generalized_mean(self, overall_values):
        # overall_values holds the per-image log-probabilities (natural log).
        # Note: math.exp underflows to 0.0 for inputs below about -745,
        # after which math.log raises ValueError.
        probability_values = [math.exp(record) for record in overall_values]
        self.log_decisiveness = math.log(self._calculate_decisiveness(probability_values))
        self.log_accuracy = math.log(self._calculate_accuracy(probability_values))
        self.log_robustness = math.log(self._calculate_robustness(probability_values))
        log_probability_values = [math.log(record) for record in probability_values]
        return log_probability_values

    def _calculate_decisiveness(self, overall_values):
        # Arithmetic mean (generalized mean with power 1).
        return sum(overall_values) / float(len(overall_values))

    def _calculate_accuracy(self, overall_values):
        # Geometric mean (generalized mean with power 0), built up as a
        # product of n-th roots, one element at a time.
        n = float(len(overall_values))
        result = 1
        for x in overall_values:
            result = result * x ** (1 / n)
        return result

    def _calculate_robustness(self, overall_values):
        # Generalized mean with power -2/3.
        result = 0
        for x in overall_values:
            result = result + (x ** (-2/3))
        result = (result / float(len(overall_values))) ** (-3/2)
        return result

For 2-5, I agree it is a good idea to start with the regular KL divergence and loss (negELBO), then do the coupled_divergence and coupled loss function once the nsc code is ready and integrated with the VAE.
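As a possible alternative to the exp/log round trip in the quoted helpers, the three metrics can be computed directly from the log-probabilities. The sketch below is an illustration only: it is not the nsc library's generalized_mean, and the function names are hypothetical. It uses the identity log M_r = (1/r) log(mean(exp(r log p))) with a max-shift for stability, and also includes the power formula r = -2 kappa/(1 + d kappa) quoted earlier in the thread.

```python
import numpy as np


def log_generalized_mean(log_probs, r):
    """Log of the generalized (power) mean of probabilities, from their logs."""
    log_probs = np.asarray(log_probs, dtype=float)
    if r == 0:
        # The power-0 limit is the geometric mean: the mean of the logs.
        return float(np.mean(log_probs))
    scaled = r * log_probs
    shift = np.max(scaled)  # subtract the max before exponentiating
    log_mean_pow = shift + np.log(np.mean(np.exp(scaled - shift)))
    return float(log_mean_pow / r)


def coupling_to_power(kappa, d):
    """Power associated with coupling kappa and dimension d."""
    return -2.0 * kappa / (1.0 + d * kappa)


# Log-probabilities far below the range where math.exp would underflow.
log_probs = np.array([-1000.0, -2000.0])
log_decisiveness = log_generalized_mean(log_probs, 1.0)      # arithmetic mean
log_accuracy = log_generalized_mean(log_probs, 0.0)          # geometric mean
log_robustness = log_generalized_mean(log_probs, -2.0 / 3.0)  # -2/3 mean
```

The three returned values are already on the log scale, so they could be passed straight to the `plt.axvline` overlays used in the histogram code.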

kenricnelson commented 3 years ago

Histogram code from the abstract with Dan is a good place to start; hopefully, this is the version of the code where all the subtle formatting issues had been worked out.

The Mathematica file on the assessment metric needs to be updated, so, as you point out, it is no longer a good guide. To keep things simpler, all three probabilities will be multiplied together to form the probability representing the negELBO. I'll work on revisions to this document.

Kevin-Chen0 commented 3 years ago

I believe I have completed 1-3. I'll have to wait until the integration of NSC in order to do 4-5 regarding the coupled_divergence and the coupled distributions.