pyro-ppl / pyro

Deep universal probabilistic programming with Python and PyTorch
http://pyro.ai
Apache License 2.0
8.57k stars 986 forks source link

[discussion] Shouldn't need cloning for Skip Layers? #3154

Closed maulberto3 closed 1 year ago

maulberto3 commented 2 years ago

Issue Description

Regarding the forward method of the ConditionalAutoRegressiveNN class, I feel that x needs cloning before doing the skip layers. Attached line: https://github.com/pyro-ppl/pyro/blob/8b7e5641228b664b3c9b674a079e95a478f3e2a0/pyro/nn/auto_reg_nn.py#L262

That's it.

martinjankowiak commented 2 years ago

@maulberto3 why do you think so? there are no in-place ops. looks fine to me.

maulberto3 commented 2 years ago

Hi @martinjankowiak My understanding is that skip connections pass the inputs as they are to subsequent layers in order to prevent input signal lost in those layers. So, If you don't clone x, h will reference x, then, operations over h will reflect also on x. So, the skip layer in the following line will add a processed x, thus not as it was. Sorry if I'm mistaken.

martinjankowiak commented 2 years ago

@maulberto3 afaik as i know there are multiple skip topologies that people use, including the one you describe. @stefanwebb can probably comment further

fritzo commented 1 year ago

@maulberto3 you would be correct that if the code used in-place operations, a clone would be needed

def _forward(self, x):
    h = x.clone()                       # .clone() would be necessary
    for layer in self.layers[:-1]:
        h[...] = self.f(layer(h))       # if we operated in-place here
    h[...] = self.layers[-1](h)         # and here
    if self.skip_layer is not None:
        h = h + self.skip_layer(x)

however the current implementation does not use in-place operations, rather the local Python variable h is replaced with effectively a pointer to a new tensor, with the original tensor x modified. This Python behavior is a little different from some langauges where h = ... replaces the data rather than the pointer.

tl;dr thanks for your attention to detail, but this case appears safe as-is 👍