peterwittek / qml-rg

Quantum Machine Learning Reading Group @ ICFO

Transverse field in Amin et al. #35

apozas opened this issue 7 years ago

apozas commented 7 years ago

When reading this week's quantum paper, I came across something I do not quite understand. In Eq. (16) they define their truly quantum Boltzmann machine Hamiltonian, which has a transverse field, and then they argue that there is one term in the derivative of the error function, Eq. (18), that cannot be estimated using sampling. My question is then: does this mean that this QBM would not offer any improvement over a QBM without the transverse field? It seems that with exact diagonalization they do get improvements, so would this be more of an experimental limitation?
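For reference, and so the question is self-contained (this is my transcription of the paper, so apologies if I garble it): Eq. (16) is a transverse-field Ising Hamiltonian,

$$H = -\sum_a \Gamma_a \sigma_a^x - \sum_a b_a \sigma_a^z - \sum_{a,b} w_{ab}\,\sigma_a^z\sigma_b^z .$$

Once $\Gamma_a \neq 0$, $\partial_\theta H$ no longer commutes with $H$, so differentiating $e^{-H}$ requires Duhamel's formula,

$$\partial_\theta e^{-H} = -\int_0^1 dt\; e^{-tH}\,(\partial_\theta H)\,e^{-(1-t)H},$$

and the clamped average this produces in the gradient, Eq. (18), is not the expectation value of any fixed observable in a Gibbs state, which is why they say it cannot be estimated by sampling.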

peterwittek commented 7 years ago

From Roger:

I guess it depends what you mean by "improvement". Having an extra parameter, be it a transverse field or anything else, should always be expected to increase the "expressive power" - what I consider the ability of a neural network to represent large data sets (i.e. wavefunctions) in an efficient way. The inability to learn the parameter as part of the optimization procedure limits this; regardless, the network could still be more expressive with it than without it.

"Improvement" could also mean an improvement in learning; i.e. improving the stochastic gradient descent. I think that is a harder question to answer than the raw expressivity...

apozas commented 7 years ago

Thanks. My question was more regarding the first meaning of "improvement" that you described. If you cannot "read off" in an experiment (this is what I understand by "estimated using sampling") the information encoded in the parameters of the transverse field, is it really possible to obtain more accurate predictions, or the same predictions with fewer physical/time resources, than with a classical Boltzmann machine?
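Partly answering my own question after rereading (so take this with a grain of salt): if I understand correctly, this is exactly why the paper trains on a lower bound of the log-likelihood instead of the likelihood itself. Using the Golden-Thompson inequality $\mathrm{Tr}[e^A e^B] \ge \mathrm{Tr}[e^{A+B}]$, and writing $\ln P_v$ loosely as an infinite energy penalty on configurations outside $v$,

$$P_v = \frac{\mathrm{Tr}[P_v e^{-H}]}{\mathrm{Tr}[e^{-H}]} \;\ge\; \frac{\mathrm{Tr}[e^{-H_v}]}{\mathrm{Tr}[e^{-H}]}, \qquad H_v \equiv H - \ln P_v .$$

The gradients of this bound are differences of clamped and unclamped Gibbs averages, so they are samplable; but for a clamped visible qubit $\langle\sigma_a^x\rangle$ vanishes, so maximizing the bound just drives the visible $\Gamma_a$ to zero, and the transverse field ends up as a fixed hyperparameter rather than a learned one.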

On the other hand, I'm quite sure that this is useful for some sort of simulation of dynamics, or for representing large data sets, when the information encoded in these extra parameters can be transferred to other systems from which it can be read off. But again, I don't quite see the advantage of encoding information in places that you cannot decode afterwards.