Negative phase in Boltzmann machine

gorkamunoz commented 7 years ago

I was trying to train a Boltzmann machine the way it is done in the paper by Benedetti et al. (arXiv:1609.02542), and I am wondering how to calculate equations (9) and (10) correctly. My problem comes when calculating the ensemble average with respect to the distribution P(z) (the negative phase). Does this distribution evolve at the same rate as the coupling parameters J and h change through equations (7) and (8)? In other words, each time I update the couplings, should I apply these changes to the model lattice and calculate a new spin distribution?

apozas commented 7 years ago

I see what you are asking, but I do not understand the "calculate a new spin distribution" part. When training, the data that you know are precisely the spins (the zs), and at each step you update the couplings J and h. To do this, you need to build the distributions corresponding to the data and to the model.

The one coming from the data (the positive phase one) seems like it should be somehow "given", or at least this is how it seems to me. It looks like a distribution over all possible spin configurations that is nonzero only for the configurations in the training set.

Then, the negative phase one is built in the following manner: at each step you have some Js and hs. You then build your function E(z), where now the zs are the free variables. Next, you build the distribution P(z) = 1/Z * exp(-beta E(z)) for some beta, and finally you sample from that distribution.
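For a handful of spins, this construction can be written out by brute force. The snippet below is only a minimal sketch: it assumes spins take values ±1 and the energy convention E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i (check this against the paper), and the function names and the numbers in J and h are made up for illustration.

```python
import itertools
import numpy as np

def energy(z, J, h):
    # Assumed convention: E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i.
    # Only the strict upper triangle of J is used, so each pair counts once.
    return -z @ np.triu(J, k=1) @ z - h @ z

def boltzmann_distribution(J, h, beta=1.0):
    # Enumerate all 2^n configurations of n spins in {-1, +1}.
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    energies = np.array([energy(z, J, h) for z in states])
    weights = np.exp(-beta * energies)
    # Dividing by the sum of the weights is the 1/Z normalization.
    return states, weights / weights.sum()

# Illustrative (hypothetical) couplings and fields for 3 spins.
J = np.array([[0.0, 0.5, -0.3],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
h = np.array([0.1, -0.2, 0.0])

states, P = boltzmann_distribution(J, h, beta=1.0)

# Sample configurations from P(z) for the current J and h.
rng = np.random.default_rng(0)
samples = states[rng.choice(len(states), size=1000, p=P)]
```

Exact enumeration scales as 2^n, so this is only feasible for small systems; for larger ones the sampling step would have to be done with some approximate sampler instead.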

So for me it looks more difficult to really understand the positive phase distribution.

Maybe I didn't understand your question well. Was this helpful?

gorkamunoz commented 7 years ago

Ok, so my problem comes when sampling from the distribution, as you indicate in the last paragraph. Maybe I am thinking in too 'physical' a way, but to me, once I update the Js and the hs, I should let the system evolve until it reaches equilibrium, and then sample from the distribution of spins.

But this equilibration process seems quite inefficient, so I guess I am doing something wrong?

You can rephrase all my problems as follows: are the averages in equations (9) and (10) constant, or does <>_M change? (To my understanding, <>_D is constant, as it comes from the given training set.)

apozas commented 7 years ago

Hmmm... I am not very convinced that any "evolution of the physical system" is actually taking place (despite the fact that we might have discussed it a few weeks ago). The process of learning, as I see it, should consist of these steps:

1.- Compute Q(z) with the training points (this distribution will not change throughout the process). You can already compute the <>_D (the positive phase contributions) if you want, since they will indeed not change throughout the process. You can either sample according to the probability distribution, or just compute <A>_D = \sum_z A(z) Q(z).

2.- Initialize the Js and hs (i.e., give them arbitrary initial values).

3.- Build the distribution P(z) = 1/Z exp(-beta E(z)), with E(z) the energy function corresponding to the current Js and hs.

4.- Compute the negative phase contributions <>_M (again, either by sampling or just computing them "analytically").

5.- Compute the new Js and hs according to Eqs. 9 and 10.

6.- Go to step 3 until convergence (i.e., until P(z) ~ Q(z)).
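The steps above can be sketched in a few lines for a small, fully visible model where everything is computed "analytically" by enumeration. This is a rough sketch under several assumptions that are not taken from the paper: spins are ±1, the energy is E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i (with J stored as a symmetric matrix), and the update is the usual gradient-ascent rule "positive phase minus negative phase" with a hypothetical learning rate `lr`. All function names and the toy data are made up.

```python
import itertools
import numpy as np

def all_states(n):
    # All 2^n configurations of n spins in {-1, +1}.
    return np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)

def model_distribution(states, J, h, beta=1.0):
    # J is symmetric with zero diagonal, so 0.5 * z J z counts each pair once.
    energies = -0.5 * np.einsum("si,ij,sj->s", states, J, states) - states @ h
    weights = np.exp(-beta * energies)
    return weights / weights.sum()

def expectations(states, probs):
    mean_z = probs @ states                         # <z_i>
    mean_zz = states.T @ (probs[:, None] * states)  # <z_i z_j>
    return mean_z, mean_zz

def train(data, steps=2000, lr=0.1, beta=1.0):
    n = data.shape[1]
    states = all_states(n)
    # Step 1: positive phase, computed once from the training set.
    pos_z = data.mean(axis=0)
    pos_zz = data.T @ data / len(data)
    # Step 2: initialize the couplings and fields.
    J = np.zeros((n, n))
    h = np.zeros(n)
    for _ in range(steps):                           # Step 6: repeat.
        P = model_distribution(states, J, h, beta)   # Step 3: rebuild P(z).
        neg_z, neg_zz = expectations(states, P)      # Step 4: negative phase.
        J += lr * (pos_zz - neg_zz)                  # Step 5: assumed update rule.
        np.fill_diagonal(J, 0.0)
        h += lr * (pos_z - neg_z)
    return J, h

# Toy training set: every 3-spin configuration, repeated a made-up number of times.
states = all_states(3)
counts = np.array([5, 1, 2, 3, 3, 2, 1, 5])
data = np.repeat(states, counts, axis=0)
J, h = train(data)
```

Note that the only thing recomputed inside the loop is P(z) for the current numbers J and h; the positive phase is fixed once and for all in step 1.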

gorkamunoz commented 7 years ago

Yes, I had a similar idea. However, my problem comes in step 3! P(z) is a function of the couplings J and h, but also of a certain configuration of spins, which you are then sampling in step 4. How do we define these spins? It is the evolution of these spins that bothers me, as in a physical system they should change as you change the couplings between them.

apozas commented 7 years ago

Ok, let's say it this way: Let's denote the set {Js, hs} with the letter c (for couplings). Then the function P depends on both the couplings and the spins. We denote this as P_c(z). With the couplings initialized on step 2, c0, you can define the function P_c0(z), where the symbols J and h have been substituted by the corresponding numbers, so you have an expression that only depends on the variables z (your spins, but they have not taken any specific value yet).

Then, in the next step, you compute the expectation values. Say we do it by computing the expectation values directly (without sampling). Say furthermore that you want to compute the contribution to the parameter J_ij. What you would have to do is take the function z_i * z_j, multiply it by the distribution P_c0(z) (or, given that it is normalized, by the corresponding marginal P_c0(z_i, z_j)), and sum over all possible values of z_i and z_j. This should give a number as output.
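Written out, and assuming the spins take the values ±1 (an assumption here, not something fixed above), the number in question would be:

```latex
\langle z_i z_j \rangle_M
  = \sum_{z} z_i z_j \, P_{c_0}(z)
  = \sum_{z_i, z_j \in \{-1, +1\}} z_i z_j \, P_{c_0}(z_i, z_j),
\qquad
P_{c_0}(z_i, z_j) = \sum_{z_k,\; k \neq i, j} P_{c_0}(z).
```

The second equality just carries out the sum over all the other spins first, which is what defines the marginal.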

When you are done with computing all the contributions, you add and subtract them according to Eqs. 9 and 10. In this way, you have the new numerical parameters c1 that you will have to substitute to obtain the distribution P_c1(z) over the spin values.

As you see, there is no physical evolution of the spins or the distribution. Furthermore, I would expect any such evolution to be slow at best, given that the state that you are preparing (the distribution P) is thermal with respect to the energy function, so I think we should not expect any evolution at all (in a closed system).

apozas commented 7 years ago

I guess after the discussion in the meeting this issue can be closed.

peterwittek / qml-rg

Negative phase in Boltzmann machine #51