Why differently shaped tensors can't be concatenated with rearrange?

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

https://einops.rocks

MIT License


Why differently shaped tensors can't be concatenated with rearrange? #20

Open Mithrillion opened 5 years ago

Mithrillion commented 5 years ago

Currently, concatenation as in the example is done by calling stack_on_zeroth_dimension() first and then rearranging the tensor into the appropriate shape. However, most backend.stack() implementations require all inputs to have the same shape, so simple concatenation along a dimension with different lengths is not possible.

For example, suppose we want to combine an image with 3 channels and an image with a single channel to create a 4-channel image:

import numpy as np
from einops import rearrange

img1 = np.random.randn(300, 200, 3)
img2 = np.random.randn(300, 200, 1)

np.concatenate([img1, img2], axis=2).shape
# (300, 200, 4) as expected

rearrange([img1, img2], 'b w h c -> w h (b c)')
# np.stack error: all input arrays must have the same shape

It would be ideal if, when such cases occur, a concatenation method like np.concatenate or torch.cat were called instead of stack. I am not sure how this might break the simplicity of the rest of the code.
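The difference between the two NumPy calls can be reproduced without einops. The snippet below is only a minimal sketch with small, made-up shapes (4 x 5 instead of 300 x 200): np.stack rejects inputs whose shapes differ, while np.concatenate only needs the non-concatenated axes to match.

```python
import numpy as np

# Small stand-ins for the 3-channel and 1-channel images (shapes are illustrative).
img1 = np.zeros((4, 5, 3))
img2 = np.ones((4, 5, 1))

# np.stack adds a new axis, so every input must have exactly the same shape.
try:
    np.stack([img1, img2])
    stack_failed = False
except ValueError as err:
    stack_failed = True
    print("np.stack refused:", err)

# np.concatenate joins along an existing axis; only the other axes must agree.
merged = np.concatenate([img1, img2], axis=2)
print(merged.shape)
```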

arogozhnikov commented 5 years ago

Hi, @Mithrillion, and thanks for taking the time to report.

It may indeed look like a bug, but requiring all arguments to have the same shape is how it should work to keep einops uniform.

Some examples to explain what I mean.

# suppose this works like concatenation, so it concatenates along the channel
x = rearrange([img1, img2], 'b w h c -> w h (b c)')
# this looks like the inverse, but it returns two images of same shape
img1, img2 = rearrange(x, 'w h (b c) -> b w h c', b=2)

# if this is concatenation
x = rearrange([img1, img2], 'b w h c -> w h (b c)')
# this seems to make no sense at all
x = rearrange([img1, img2], 'b w h c -> w h (c b)')
# even harder to understand what this could mean. What are the requirements for c, b, h?
x = rearrange([img1, img2], 'b w h c -> w (c b h)')

So, concatenation is meant to work only with arrays of the same shape (at least until a way to avoid these inconsistencies is found).

I'll leave this issue open for reference so others can find it easily.

simonalford42 commented 2 years ago

What if there were a concatenate operation with syntax something like:

concatenate([img1, img2], 'x w h c, y w h c -> (x + y) w h c')

I would find this very satisfying to use.

arogozhnikov commented 2 years ago

Hi @simonalford42, see this comment: https://github.com/arogozhnikov/einops/issues/56#issuecomment-962584525. There is no ETA for this feature.

arogozhnikov commented 1 year ago

Update regarding concatenation: einops now has pack for better concatenation and unpack for better splits.