data-apis / consortium-feedback

A repository for discussions related to the Consortium and for giving feedback on it

[RFC] Adopt einops as an API for shape-changing operations #3

Open arogozhnikov opened 4 years ago

arogozhnikov commented 4 years ago

Changing several core elements of array manipulations allows writing clean and reliable code AND resolves many conflicts.

Einops is a minimalist notation for tensor manipulations that replaces the following operations:

For example, the transposition

x.transpose(0, 2, 3, 1)

is written in einops as

rearrange(x, 'b c h w -> b h w c')  # long names are also allowed
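To make the correspondence between the two spellings concrete, here is a minimal sketch in plain Python that derives the transpose axes from a pattern string. `axes_from_pattern` is a hypothetical helper written for this illustration, not part of einops, and it assumes a pure permutation with space-separated axis names (no parentheses, ellipsis, or new axes).

```python
def axes_from_pattern(pattern):
    """Return the axes tuple that x.transpose(...) would need for `pattern`.

    Hypothetical helper for illustration only; not einops code.
    """
    left, right = (side.split() for side in pattern.split("->"))
    if len(set(left)) != len(left):
        raise ValueError("axis names on the left side must be unique")
    if sorted(left) != sorted(right):
        raise ValueError("a pure permutation needs the same axis names on both sides")
    # For each output axis, look up its position in the input.
    return tuple(left.index(name) for name in right)


axes = axes_from_pattern("b c h w -> b h w c")
print(axes)  # (0, 2, 3, 1)
```

The resulting tuple is exactly what the positional spelling passes to `x.transpose`; the pattern keeps the axis names visible at the call site instead of bare indices.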

There are only three operations; please read the tutorial.

Motivation

While this looks like a big leap, it is a surprisingly good moment to make it: a move to more reliable and transferable code without deprecating parts of the existing API. In this scenario the first transition would be made by library developers, which would simplify wider adoption afterwards.

In case of a positive decision, the introduction of new features and operations in einops will first be coordinated with/by the consortium to ensure identical and efficient support by all frameworks.

rgommers commented 4 years ago

Thanks @arogozhnikov for this proposal! einops is pretty cool, it does give a more natural and less error-prone syntax for shape manipulations in many cases.

A question of scope: you just proposed rearrange, but looking at the implementation, it's implemented in terms of reduce. I assume that if libraries want to adopt rearrange, then reduce may live under the hood and get surfaced later? It's a pretty natural extension. Or would you merge those two functions into one?

solves the problem of interface conflicts that data-apis tries to solve (only for some operations, but covered subset of operations is probably the main source of discrepancies)

I wouldn't say the main source, but certainly an important source of issues.

While this looks like a big leap, it is a surprisingly good moment to make it: a move to more reliable and transferable code without deprecating parts of the existing API. In this scenario the first transition would be made by library developers, which would simplify wider adoption afterwards.

This does indeed seem like a good time to consider rearrange. We don't plan to propose deprecating any of the functions that rearrange replaces, but we will have to deal with some of the annoying discrepancies you point out. I can imagine we would leave a function like repeat out of the API standard, because it's not used all that much plus it's inconsistent between libraries.
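As a small illustration of that inconsistency: in NumPy, `repeat` repeats each element while `tile` repeats the whole array, and PyTorch's `Tensor.repeat` follows the `tile` behaviour, so the same name means different things in the two libraries. The sketch below uses only NumPy; the PyTorch behaviour is mentioned in a comment rather than executed.

```python
import numpy as np

x = np.array([1, 2, 3])

each_element = np.repeat(x, 2)  # element-wise repetition
whole_array = np.tile(x, 2)     # whole-array repetition; torch.Tensor.repeat behaves like this

print(each_element)  # [1 1 2 2 3 3]
print(whole_array)   # [1 2 3 1 2 3]
```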

In case of a positive decision, the introduction of new features and operations in einops will first be coordinated with/by the consortium to ensure identical and efficient support by all frameworks.

This coordination is indeed something that we'd like to see happen in this Consortium. We cannot standardize something that isn't adopted by libraries yet, but we can review rearrange here, ensure the syntax and semantics of it will work well for all libraries, and then recommend adoption. And then if all libraries agree that's a good idea, move it into the API standard itself.

We haven't quite worked out the formal process for this, but we are about to write up and discuss the proposal for evolution of the standard. I imagine in the case of rearrange it should get a status like "Provisional", and then it can move to "Final" in a future version of the standard, with the criteria for the move including libraries having implemented it (and possibly having it in a released version).

This issue is probably a good place to take a "temperature reading" from maintainers of the various array libraries. No one can give a definitive verdict probably, but an indication is good enough. With my NumPy hat on: I suspect NumPy will be interested - rearrange goes quite naturally with einsum, and that's a function most people quite like.
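To illustrate the kinship with einsum, the transposition from the proposal can already be spelled with index notation in NumPy today. This is only a sketch: `np.einsum` uses single-letter indices and covers just the permutation case here, whereas the einops pattern syntax is richer.

```python
import numpy as np

x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)  # axes: b, c, h, w

y_transpose = x.transpose(0, 2, 3, 1)
y_einsum = np.einsum("bchw->bhwc", x)  # same permutation, named indices

print(y_einsum.shape)  # (2, 4, 5, 3)
```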

arogozhnikov commented 4 years ago

Thanks for the comments, @rgommers.

This issue is probably a good place to take a "temperature reading" from maintainers of the various array libraries. No one can give a definitive verdict probably, but an indication is good enough. With my NumPy hat on: I suspect NumPy will be interested - rearrange goes quite naturally with einsum, and that's a function most people quite like.

Great to hear that. I agree about the "temperature reading": opinions and questions are welcome.

A question of scope: you just proposed rearrange, but looking at the implementation, it's implemented in terms of reduce. I assume that if libraries want to adopt rearrange, then reduce may live under the hood and get surfaced later? It's a pretty natural extension.

Correct, reduce may be revealed later. One can think of rearrange as reduce but without any axes reduced. It's a very cheap abstraction (rearrange just skips one call), so incorporating code as-is should be fine.
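The relationship can be sketched with a toy NumPy version. `toy_reduce` and `toy_rearrange` are hypothetical names invented for this illustration and are not the einops implementation; the sketch assumes plain space-separated axis names only (no parentheses, ellipsis, or new axes).

```python
import numpy as np

_REDUCTIONS = {"sum": np.sum, "mean": np.mean, "max": np.max, "min": np.min}


def toy_reduce(x, pattern, reduction):
    """Reduce the axes missing from the right side, then permute the rest.

    Hypothetical toy for illustration only; not einops code.
    """
    left, right = (side.split() for side in pattern.split("->"))
    if len(left) != x.ndim or len(set(left)) != len(left):
        raise ValueError("left side must name every input axis exactly once")
    if len(set(right)) != len(right) or not set(right) <= set(left):
        raise ValueError("right side must use distinct axis names from the left side")
    reduced = tuple(i for i, name in enumerate(left) if name not in right)
    if reduced:
        x = _REDUCTIONS[reduction](x, axis=reduced)
    kept = [name for name in left if name in right]
    order = tuple(kept.index(name) for name in right)
    return np.transpose(x, order) if order else x


def toy_rearrange(x, pattern):
    """Same code path as toy_reduce, with no axes reduced."""
    left, right = (side.split() for side in pattern.split("->"))
    if sorted(left) != sorted(right):
        raise ValueError("rearrange must keep every axis")
    return toy_reduce(x, pattern, reduction=None)


x = np.arange(24, dtype=float).reshape(2, 3, 4)  # axes: a, b, c
summed = toy_reduce(x, "a b c -> c a", "sum")    # reduces b, then swaps a and c
moved = toy_rearrange(x, "a b c -> c a b")       # nothing reduced, only permuted
print(summed.shape, moved.shape)  # (4, 2) (4, 2, 3)
```

In this sketch the only difference between the two entry points is whether any axis disappears on the right side, which is why keeping both names costs almost nothing.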

I'm open to standardizing all three functions, but keeping the discussion scoped to one operation may be easier.

Or would you merge those two functions into one?

rearrange and reduce could be one function, but having two functions in the interface makes the user's intention more readable, so I'd keep both, as it is now.

This does indeed seem like a good time to consider rearrange. We don't plan to propose deprecating any of the functions that rearrange replaces, but we will have to deal with some of the annoying discrepancies you point out. I can imagine we would leave a function like repeat out of the API standard, because it's not used all that much plus it's inconsistent between libraries.

Maybe I phrased that badly, but that's exactly my thought: we don't need to deprecate existing functions, but we can still have aligned APIs.

This coordination is indeed something that we'd like to see happen in this Consortium. We cannot standardize something that isn't adopted by libraries yet, but we can review rearrange here, ensure the syntax and semantics of it will work well for all libraries, and then recommend adoption. And then if all libraries agree that's a good idea, move it into the API standard itself.

We haven't quite worked out the formal process for this, but we are about to write up and discuss the proposal for evolution of the standard. I imagine in the case of rearrange it should get a status like "Provisional", and then it can move to "Final" in a future version of the standard, with the criteria for the move including libraries having implemented it (and possibly having it in a released version).

That sounds like the right sequence of steps. Once there is more clarity about the process, let me know what my part is and how I can help.