didriknielsen / survae_flows

Code for paper "SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows"
MIT License

Is the GatedTanhUnit implementation slightly mistaken? #15

Closed: pkulwj1994 closed this issue 3 years ago

pkulwj1994 commented 3 years ago

These experiments are thorough and complete for generative modeling. While walking through the code in detail, I found that the GatedTanhUnit implementation (summarized in GatedTanhUnit, detailed in gated_tanh) does not match my understanding of the gated tanh unit.

The gated tanh unit in the original paper seems to pass the same input through two different streams and then elementwise-multiply the two filtered outputs. In this repo, however, GatedTanhUnit splits (chunks) the input along a given dimension, filters the two halves separately, and elementwise-multiplies the results, so the final output has half the dimensionality of the input. The relevant code is copied below:

import torch

def gated_tanh(x, dim):
    '''Gated Tanh activation: chunk x into two halves along `dim`,
    apply tanh to one and sigmoid to the other, then multiply elementwise.'''
    x_tanh, x_sigmoid = torch.chunk(x, 2, dim=dim)
    return torch.tanh(x_tanh) * torch.sigmoid(x_sigmoid)
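
For concreteness, here is a quick shape check (my own example, not from the repo) using the gated_tanh above, which shows the halving I mean:

import torch

x = torch.randn(8, 64, 32, 32)   # e.g. a conv feature map with 64 channels
y = gated_tanh(x, dim=1)         # chunk along the channel dimension
print(y.shape)                   # torch.Size([8, 32, 32, 32]): channels halved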

Many thanks for checking and responding!

didriknielsen commented 3 years ago

Thanks, but I think this should correspond to the standard version when you put a linear layer in front (such as nn.Linear or nn.Conv2d).
This is similar to the Gated Linear Unit, which is implemented in PyTorch.
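
A minimal sketch of that point (my own illustration, not from the repo; the widths d and h are arbitrary): a single linear layer with doubled output width, followed by the chunked gated_tanh, matches two separate tanh/sigmoid streams applied to the same input:

import torch
import torch.nn as nn

def gated_tanh(x, dim):
    # gated_tanh as in the repo snippet above
    x_tanh, x_sigmoid = torch.chunk(x, 2, dim=dim)
    return torch.tanh(x_tanh) * torch.sigmoid(x_sigmoid)

d, h = 16, 8
linear = nn.Linear(d, 2 * h)   # one layer producing both streams at once
x = torch.randn(4, d)

# Repo-style: one wide linear layer, then chunk into tanh/sigmoid halves.
out_chunked = gated_tanh(linear(x), dim=-1)

# Two-stream view: the same weights, split into two separate linear maps.
W_tanh, W_sigmoid = torch.chunk(linear.weight, 2, dim=0)
b_tanh, b_sigmoid = torch.chunk(linear.bias, 2, dim=0)
out_two_stream = torch.tanh(x @ W_tanh.T + b_tanh) * torch.sigmoid(x @ W_sigmoid.T + b_sigmoid)

assert torch.allclose(out_chunked, out_two_stream)

The analogous built-in is torch.nn.functional.glu, which computes a * sigmoid(b) on the two chunks rather than tanh(a) * sigmoid(b).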

pkulwj1994 commented 3 years ago

Got it! This implementation is quite similar to a non-parametric affine coupling layer without output concatenation, so this kind of GLU reduces the dimension by half. Hope I understand correctly.