wilsonmr / anvil

Repository containing code related to flow based generative model
https://wilsonmr.github.io/anvil/
GNU General Public License v3.0

Stereographic projection #39

Closed jmarshrossney closed 4 years ago

jmarshrossney commented 4 years ago

Stereographic projection wrapping Real NVP for O(2) and O(3) sigma models. Replaces #35

Fairly minimal - no great attempt to make things amenable to generalisation, yet. Intending to tidy a few bits and pieces up when implementing spline flows, since it will require more thought anyway.

I've slightly messed around with the training script so that it also plots the base and model output distribution for a large sample, so we have an easy way to visually inspect what the model is doing. It's very clearly tacked on though so feel free to suggest improvements!

wilsonmr commented 4 years ago

I think the training and any plotting should be independent. I see the appeal of being able to plot some things every time, but if you retrain, for example, this overwrites the set of plots, which is undesirable. If we want to be able to quickly plot this then I think we should just add another example runcard which produces a single plot - no report.

jmarshrossney commented 4 years ago

Ok, will replace with a runcard that does the same thing but through anvil-sample

jmarshrossney commented 4 years ago

I've changed some cases of lattice_size to config_size so things work for O(3) with the two-dimensional projection method.

wilsonmr commented 4 years ago

sounds good.

For my own benefit, is the stereographic projection the only way we have of training O(3)? What happened to the naive tan/tanh functions? Maybe I missed something.

jmarshrossney commented 4 years ago

The tan/tanh are great for fields with one component, although tan was more stable - this is stereographic projection in 1D.
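To make the 1D statement concrete, here is a minimal sketch of stereographic projection of a circle onto the real line. It assumes the convention x = tan(θ/2) with θ in (-π, π); the convention actually used in anvil may differ, and the function names are hypothetical. It also checks the standard result that a uniform angle is mapped to a Cauchy-distributed variable.

```python
import math


def circle_to_line(theta):
    """Project an angle in (-pi, pi) onto the real line (assumed convention)."""
    return math.tan(theta / 2)


def line_to_circle(x):
    """Inverse projection: real line back to an angle in (-pi, pi)."""
    return 2 * math.atan(x)


def log_abs_jacobian(theta):
    """log |dx/dtheta| = log((1 + x^2) / 2), where x = tan(theta / 2)."""
    x = circle_to_line(theta)
    return math.log((1 + x * x) / 2)


theta = 1.2
x = circle_to_line(theta)
theta_back = line_to_circle(x)

# Change of variables: p_x(x) = p_theta(theta) / |dx/dtheta|.
uniform_density = 1 / (2 * math.pi)
projected_density = uniform_density * math.exp(-log_abs_jacobian(theta))

# A uniform angle should become a standard Cauchy variable on the line.
cauchy_density = 1 / (math.pi * (1 + x * x))
```

The log-Jacobian is the term a flow would add to its log-density when wrapping a real-line transformation with this projection.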

For the O(3) fields I have tried a direct generalisation of this (passing the whole thing through a tan), which leads to fields having toroidal topology. This isn't what we want - for instance, it doesn't lead to fields uniformly distributed on a sphere in the T->inf limit. So what is being implemented here is the actual generalisation of stereographic projection to 2D, and it does map a spherical uniform distribution to itself at high temperatures.
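As an illustration of the 2D case, the following sketch projects a point on the unit sphere, given by polar angle θ in [0, π) and azimuth φ in [0, 2π), onto the plane. It assumes projection from the pole at θ = π with radius r = tan(θ/2); this convention and the function names are assumptions, not necessarily what the PR implements. Under this map the uniform distribution on the sphere becomes the density 1 / (π (1 + r²)²) on the plane, which the last lines check via the change-of-variables formula.

```python
import math


def sphere_to_plane(theta, phi):
    """Stereographic projection of (theta, phi) on the unit sphere to (x, y)."""
    r = math.tan(theta / 2)
    return r * math.cos(phi), r * math.sin(phi)


def plane_to_sphere(x, y):
    """Inverse projection: (x, y) in the plane back to (theta, phi)."""
    theta = 2 * math.atan(math.hypot(x, y))
    phi = math.atan2(y, x) % (2 * math.pi)
    return theta, phi


def projected_uniform_density(x, y):
    """Density on the plane of a uniformly distributed point on the sphere."""
    return 1 / (math.pi * (1 + x * x + y * y) ** 2)


theta, phi = 1.0, 2.5
x, y = sphere_to_plane(theta, phi)
theta_back, phi_back = plane_to_sphere(x, y)

# Uniform on the sphere has density sin(theta) / (4 pi) in (theta, phi).
# The Jacobian of (theta, phi) -> (x, y) is r * dr/dtheta = r (1 + r^2) / 2.
r = math.hypot(x, y)
density_via_jacobian = (math.sin(theta) / (4 * math.pi)) / (r * (1 + r * r) / 2)
density_closed_form = projected_uniform_density(x, y)
```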

Unfortunately it seems to be crap at intermediate and low temperatures, so the next thing to do is adopt the 'recursive method' from the February paper.

wilsonmr commented 4 years ago

ah ok, thanks

jmarshrossney commented 4 years ago

hmm actually sorry, what I just said isn't entirely true...I'm rusty. Applying tan/tanh to both angles doesn't mean the fields have toroidal topology - that depends on the base distribution.

I think everything through a tan/tanh probably can work for O(3). I think the reason why it wasn't considered in the MIT papers and why it didn't work as well for me is that the topology of the fields is not "built in" - i.e. the probability density is not guaranteed to vary smoothly under O(3) rotations. So it probably still works but the model has to learn these additional constraints, and it's preferable to build them in a priori.

Sorry for the confusion.

jmarshrossney commented 4 years ago

I've just made it so we can plot the base, model and target distributions using anvil-sample. There's a runcard for just this action so we can use it when we've not yet implemented observables.

I've also changed the projection action so it no longer takes the real_nvp action as a parameter.

jmarshrossney commented 4 years ago

Just fixed a typo in the distributions.yml example runcard, so now it should work. I've also added the semicircle distribution just for interest's sake.

wilsonmr commented 4 years ago

So, as you might be able to see, I haven't made the generic input yet because I was making sure I understood ProjectSphere before splitting it up. It took me a little while because there's a lot of gymnastics with dimensions, but in the end it looks like it does the right thing.

I have just made it so that the nvp can accept something like (n_batch, *, n_lattice) and so in theory one could change the distributions to be like (n_batch, n_dim, n_lattice) and I think that you could get rid of a lot of the view business which I find very confusing.
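A rough sketch of the idea, using NumPy rather than PyTorch for brevity: a coupling transformation that indexes only the last axis works unchanged on inputs of shape (n_batch, n_lattice) or (n_batch, n_dim, n_lattice), so no reshaping is needed between the two. The function `affine_coupling`, the toy scale and shift functions, and the even/odd partition are all hypothetical stand-ins, not anvil's actual real NVP layer.

```python
import numpy as np


def affine_coupling(x, passive, active):
    """Toy affine coupling acting on the last axis; leading axes broadcast."""
    s = 0.1 * np.tanh(x[..., passive])  # stand-in for a scale network
    t = 0.5 * x[..., passive]           # stand-in for a shift network
    out = x.copy()
    out[..., active] = x[..., active] * np.exp(s) + t
    # Sum the log-scale over every axis except the batch axis.
    log_det = s.reshape(x.shape[0], -1).sum(axis=1)
    return out, log_det


n_batch, n_dim, n_lattice = 4, 2, 6
rng = np.random.default_rng(0)
x = rng.normal(size=(n_batch, n_dim, n_lattice))

# Even/odd split of the lattice sites (assumed partition).
passive = np.arange(0, n_lattice, 2)
active = np.arange(1, n_lattice, 2)

# One call handles all field components at once.
y, log_det = affine_coupling(x, passive, active)

# Equivalent per-component calls on (n_batch, n_lattice) slices.
per_component = [affine_coupling(x[:, d, :], passive, active) for d in range(n_dim)]
```

The batched call and the per-component calls agree, which is the property that would let the distributions carry an explicit n_dim axis.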

In the end I think we can add the flexibility later, and getting rid of view everywhere can possibly be done later too, as it will require a non-negligible amount of time to sort out.

jmarshrossney commented 4 years ago

Ok sure. I don't think we should worry too much about the 2d projection, even if it's a bit messy to reshape the tensor a few times, because it's not working too well and the other approaches to normalising flows on spheres (as far as I've read) deal with one set of angles at a time.

That said, I definitely support the idea of building everything so it can take an extra dimension.