Open JBOE22175 opened 3 years ago
In the meantime I realized that repeat is not available as a Layer. My proposal is to provide the function repeat as a Layer as well.
My workaround is to define a simple wrapper layer myself. The important point is that the parameters are set when calling forward, not when instantiating; e.g. the batch size is only available during layer execution.
import torch.nn as nn
from einops import repeat

class Repeat(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, pattern, **params):
        return repeat(x, pattern, **params)

repeat_layer = Repeat()
repeat_layer(inp, 'n d -> b n d', b=2)
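With the wrapper registered as a submodule, it appears as a single named node in the traced graph. A minimal sketch of how this could look with add_graph (TinyModel and the shapes are made up for illustration):

import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter
from einops import repeat

class Repeat(nn.Module):  # wrapper from above
    def forward(self, x, pattern, **params):
        return repeat(x, pattern, **params)

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.repeat = Repeat()         # shows up as one named submodule
        self.proj = nn.Linear(16, 16)

    def forward(self, x):
        x = self.repeat(x, 'n d -> b n d', b=2)  # (8, 16) -> (2, 8, 16)
        return self.proj(x)

writer = SummaryWriter()
writer.add_graph(TinyModel(), torch.randn(8, 16))
writer.close()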
By the way: the same approach works for einsum, which has the same problem with TensorBoard graphs:
from torch import einsum  # assuming torch.einsum, which takes the pattern first

class Einsum(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, pattern, x, y):
        return einsum(pattern, x, y)

einsum_layer = Einsum()
einsum_layer('b i d, b j d -> b i j', q, k)
Hi @JBOE22175, thanks for the very detailed description of the issue and the notebook.
I also don't see tensorboard showing the actual parameters of layers, that's a bummer.
Repeat as a layer - sure, we can add it as a layer; I did not realize previously that this helps with TB readability. Open an issue if you think that's the solution you want to use.
"parse_shape" is also a nice function but it creates a warning when used with tensorboard add_graph() Yes, that's inavoidable with tracing. Should not cause real problems as after tracing your shapes are assumed to be fixed
Hi, thanks for your answer. I will create an issue to add repeat as a layer.
Concerning the use of pattern and parameters for repeat: my use case is, among others, to translate the repeat calls in the perceiver-pytorch implementation by Phil Wang: https://github.com/lucidrains/perceiver-pytorch/blob/main/perceiver_pytorch/perceiver_pytorch.py. Case 1: here h = heads is known when calling init:
class Attention(nn.Module):
    def forward(self, x, context=None, mask=None):
        # excerpt; h comes from the number of heads set in __init__
        mask = repeat(mask, 'b j -> (b h) () j', h=h)
Case 2: Here b = batch_size is not known when calling init:
class Perceiver(nn.Module):
    def forward(self, data, mask=None):
        # excerpt; b is the batch size, only known from the data at forward time
        x = repeat(self.latents, 'n d -> b n d', b=b)
In both cases the pattern should be provided in init, but the parameter b in case 2 must be supplied in forward. How do you think the interface should be designed? A sketch of one possibility is below.
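One possible interface, just a sketch and not an existing einops API: fix the pattern (and any axis lengths known up front) at construction time, and pass the remaining lengths to forward:

import torch.nn as nn
from einops import repeat

class RepeatLayer(nn.Module):
    # Sketch only, not the einops API. The pattern and lengths known at
    # construction go to __init__; lengths only known at runtime go to forward.
    def __init__(self, pattern, **known_lengths):
        super().__init__()
        self.pattern = pattern
        self.known_lengths = known_lengths

    def forward(self, x, **runtime_lengths):
        return repeat(x, self.pattern, **self.known_lengths, **runtime_lengths)

# Case 1: h = heads is known at init
#   mask_repeat = RepeatLayer('b j -> (b h) () j', h=heads); mask_repeat(mask)
# Case 2: b = batch size is only known at forward
#   latent_repeat = RepeatLayer('n d -> b n d'); latent_repeat(self.latents, b=b)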
I am hitting the same issue as in case 2.
First of all: I like einops - readability in NN models is very much improved!
a) Using einops functions as operations creates nearly unreadable graphs in TensorBoard. To avoid this, you should use the einops layers instead (see the sketch after these points). You should add this to your documentation and perhaps cover layers more prominently.
b) using "parse_shape" is also a nice function but it creates a warning when used with tensorboard addgraph() : "RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results)."_
Platform: Windows with PyTorch, TensorBoard 2.2.0
See the attached Jupyter notebook:
einops_tensorboard.zip