Open JBOE22175 opened 3 years ago
In the meantime I realized that repeat is not available as a Layer. My proposal is to provide the function repeat as a Layer as well.
My workaround is to define a simple wrapper layer myself. The important point is that the parameters are set when calling forward, not when instantiating; e.g. the batch size is only available during layer execution.
import torch.nn as nn
from einops import repeat

class Repeat(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, pattern, **params):
        return repeat(x, pattern, **params)

repeat_layer = Repeat()
repeat_layer(inp, 'n d -> b n d', b=2)
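With the wrapper registered as a submodule, it appears as a single named node in the traced graph. A minimal sketch of how this could look with add_graph (TinyModel and the shapes are made up for illustration):

import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter
from einops import repeat

class Repeat(nn.Module):  # wrapper from above
    def forward(self, x, pattern, **params):
        return repeat(x, pattern, **params)

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.repeat = Repeat()         # shows up as one named submodule
        self.proj = nn.Linear(16, 16)

    def forward(self, x):
        x = self.repeat(x, 'n d -> b n d', b=2)  # (8, 16) -> (2, 8, 16)
        return self.proj(x)

writer = SummaryWriter()
writer.add_graph(TinyModel(), torch.randn(8, 16))
writer.close()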
By the way: the same approach works for einsum, which has the same problem with TensorBoard graphs:
from torch import einsum  # assuming torch.einsum, which takes the pattern first

class Einsum(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, pattern, x, y):
        return einsum(pattern, x, y)

einsum_layer = Einsum()
einsum_layer('b i d, b j d -> b i j', q, k)
Hi @JBOE22175, thanks for the very detailed description of the issue and the notebook.
I also don't see tensorboard showing the actual parameters of layers, that's a bummer.
Repeat as a layer - sure, we can add it as a layer; I did not realize previously that this helps with TB readability. Open an issue if you think that's the solution you want to use.
"parse_shape" is also a nice function but it creates a warning when used with tensorboard add_graph() Yes, that's inavoidable with tracing. Should not cause real problems as after tracing your shapes are assumed to be fixed
Hi, thanks for your answer. I will create an issue to add repeat as a layer.
Concerning the use of pattern and parameters for repeat: my use case is, among others, to translate the repeat calls in the perceiver-pytorch implementation by Phil Wang: https://github.com/lucidrains/perceiver-pytorch/blob/main/perceiver_pytorch/perceiver_pytorch.py. Case 1: here h = heads is known when calling init:
class Attention(nn.Module):
    def forward(self, x, context=None, mask=None):
        # excerpt; h comes from the number of heads set in __init__
        mask = repeat(mask, 'b j -> (b h) () j', h=h)
Case 2: Here b = batch_size is not known when calling init:
class Perceiver(nn.Module):
    def forward(self, data, mask=None):
        # excerpt; b is the batch size, only known from the data at forward time
        x = repeat(self.latents, 'n d -> b n d', b=b)
In both cases the pattern should be provided in init, but the parameter b in case 2 must be supplied in forward. How do you think the interface should be designed? A sketch of one possibility is below.
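One possible interface, just a sketch and not an existing einops API: fix the pattern (and any axis lengths known up front) at construction time, and pass the remaining lengths to forward:

import torch.nn as nn
from einops import repeat

class RepeatLayer(nn.Module):
    # Sketch only, not the einops API. The pattern and lengths known at
    # construction go to __init__; lengths only known at runtime go to forward.
    def __init__(self, pattern, **known_lengths):
        super().__init__()
        self.pattern = pattern
        self.known_lengths = known_lengths

    def forward(self, x, **runtime_lengths):
        return repeat(x, self.pattern, **self.known_lengths, **runtime_lengths)

# Case 1: h = heads is known at init
#   mask_repeat = RepeatLayer('b j -> (b h) () j', h=heads); mask_repeat(mask)
# Case 2: b = batch size is only known at forward
#   latent_repeat = RepeatLayer('n d -> b n d'); latent_repeat(self.latents, b=b)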
I am hitting the same issue as in case 2.
First of all: I like einops - readability in NN models is very much improved!
a) Using einops functions as operations creates nearly unreadable graphs in TensorBoard. To avoid this, you should use the einops layers instead (see the sketch after these points). You should add this to your documentation and perhaps cover layers more prominently.
b) using "parse_shape" is also a nice function but it creates a warning when used with tensorboard addgraph() : "RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results)."_
Platform: Windows with PyTorch, TensorBoard 2.2.0
See the attached Jupyter notebook:
einops_tensorboard.zip