Closed wellecks closed 7 years ago
How do we resolve the conflict? Should I try to integrate your Importance Sampling stuff into my refactoring? Not sure how all this works...
I can take care of the merge conflict - you'd have to merge it manually w/ the command line and resolve the conflicts
ok i'll leave it to you
Also, this comment got collapsed above:
(_get_made_mask in utils.py)
Did you check whether this is equivalent to the old code? I don't see the output mask (MV in the old code), which I think is necessary (equation 9 from MADE paper)
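For reference, here's a minimal sketch of the two masks from the MADE paper as I read eqs. 8-9: each hidden unit k gets a degree m(k) in {1, ..., D-1}, the hidden mask lets unit k see inputs 1..m(k), and the output mask lets output d see only hidden units with degree below d. (Names and shapes here are my own, just to make the construction concrete.)

```python
import numpy as np

def made_masks(D, H, rng=np.random.default_rng(0)):
    """Hidden mask M^W and output mask M^V, MADE-style.

    m(k) in {1, ..., D-1} is the degree of hidden unit k.
    M^W[k, d] = 1 iff m(k) >= d+1  (hidden unit k sees inputs 1..m(k))
    M^V[d, k] = 1 iff d+1 > m(k)   (output d sees units with degree < d+1)
    """
    m = rng.integers(1, D, size=H)  # degrees m(k), each in 1..D-1
    d = np.arange(1, D + 1)         # 1-based input/output indices
    MW = (m[:, None] >= d[None, :]).astype(float)  # shape (H, D)
    MV = (d[:, None] > m[None, :]).astype(float)   # shape (D, H)
    return MW, MV
```

The composition MV @ MW is strictly lower triangular, which is exactly the autoregressive property: output d depends only on inputs 1..d-1.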
It's not equivalent, but I don't think the output mask is necessary (or correct) here.
The way I see it, equation 9 of the MADE paper is for the situation where we are modeling the whole autoencoder x --> z --> x_rec, with a single layer of parameters in each step. So there is a hidden mask and an output mask.
This is different from our situation. Here we are using MADE to map z0 --> z1 --> ... --> zk, with a layer of parameters in each step. It doesn't relate to the mapping x --> z0, nor to zk --> x_rec. So we need a mask for each layer of IAF. I created something (in neural_networks.py) called made_layer to do just that: given an input, it applies a masked weight matrix so that output k depends only on the first k-1 inputs. However, I'm really unsure about all this, and as I said before, mine isn't working.
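To make the single-layer idea concrete, here's a sketch of what I mean (this is not the actual made_layer code, just the intended behavior): a strictly lower-triangular mask on the weight matrix, so output k sees only inputs 0..k-1.

```python
import numpy as np

def made_mask(dim):
    # Strictly lower-triangular mask: entry (k, j) is 1 iff j < k,
    # so output k can only depend on inputs 0..k-1.
    return np.tril(np.ones((dim, dim)), k=-1)

def made_layer(z, W, b):
    # Masked affine map; the mask enforces the autoregressive property
    # within this single IAF layer.
    return (W * made_mask(W.shape[0])) @ z + b
```

In particular, perturbing input j can only change outputs k > j; outputs 0..j are untouched.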
The original IAF is maintained in vae_iaf.py, and I haven't touched it.
I see. I was interpreting MADE as a function f(x) -> x', and in our case we are applying it to z, i.e. f(z0) -> zk. The function happens to be an autoencoder, and the masks that it uses internally ensure that the function satisfies the autoregressive property.
But yeah, I'm also not 100% sure, and I can see your point about it being odd to use the full reconstruction process rather than just the mapping from the input to the hidden layer.
Anyways, awesome work on the refactor! I'll merge in the changes
Creating this so I can add comments below