arogozhnikov opened this issue 2 years ago
Hi @arogozhnikov, thanks for your interesting proposal. I have a few questions:

1. We landed on the decision that `ndim` must be statically known but the sizes along each dimension (`shape`) can be unknown, meaning `len(x.shape) == x.ndim` always. Therefore, I fail to see why we need `enforce_shape`. To me introducing it would be cumbersome, e.g.: what happens if `x.ndim` is not equal to `len([None, None, None])`?
2. In your Example 2, allowing `d1` to appear on both sides implies that array libraries must support symbolic operations, which I think is out of scope and also an overkill.

Perhaps I missed something obvious...?
Hi @leofang
> In your Example 2, allowing d1 to appear on both sides implies that array libraries must support symbolic operations, which I think is out of scope and also an overkill.

The proposal does not introduce any new requirement for any backend. Maybe you were confused by my using the same variable for input and output? I've just renamed the input variable.
If the backend is define-by-run (all shapes known), `d1_input` can be a number or None, and the returned axis is a number. As I mentioned, for non-graph frameworks this can be implemented as a plain Python function, so there are no obstacles for them.
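A minimal eager-only sketch of what such a plain-Python implementation could look like (an illustration only, not part of the proposal text; no ellipsis, symbol, or OptionalAxis handling):

```python
def enforce_shape(x, shape_components):
    # Eager-only sketch: dimensionality and all sizes are known integers here.
    if x.ndim != len(shape_components):
        raise ValueError(f"expected {len(shape_components)} dimensions, got {x.ndim}")
    resolved = []
    for actual, expected in zip(x.shape, shape_components):
        # None means "any size"; an integer must match exactly
        if expected is not None and expected != actual:
            raise ValueError(f"axis size mismatch: expected {expected}, got {actual}")
        resolved.append(actual)
    return x, resolved
```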
If the backend is symbolic, `d1_input` can be a number, a symbol or None. Symbol support can be marked as optional, but I expect that computations on shapes are too common to ignore this case. The returned axis is a number or a symbol.
> we landed on the decision that ndim must be statically known but the sizes along each dimension (shape) can be unknown
That's what I mention in the first line.
Thanks @arogozhnikov, interesting proposal. I have to admit I find it a little hard to wrap my head around - probably because I have worked with and on eager libraries a lot, and don't have concrete use cases in mind where I've needed something like this.
> there are cases when axes can be split into operation-important and non-important. There is a generic path to deal with these cases by first reducing to a fixed dimensionality, and converting back afterwards. Example: last dimension should be rgb, should convert to hsl. Assuming we have a function that implements conversion for 2d array.
Let me try to unpack this a little.
- `x_rgb` is an N-dimensional array, we don't know what N is, and the shape is `(n1, n2, ..., n_N, 3)`.
- `prod(n1, n2, ..., n_N) == n_elements`.
- To figure out `n_elements` you introduce `shape_components` (is this a typo, should it also be `enforce_shape`?).
- Isn't `n_elements` also calculable as `xp.prod(xp.shape(x_rgb)[:-1])`? I think yes, and that happens to be easy here because of `[..., None]` - but if that expression gets more complex, it's harder to express without `enforce_shape`.
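(For reference, the eager-only version of the wrapping pattern under discussion would look roughly like the sketch below; `rgb_to_hsl_2d` is a hypothetical helper operating on `(n, 3)` arrays, and this only works while all leading sizes are known integers:)

```python
def rgb_to_hsl(x_rgb):
    # Collapse all leading axes, apply the 2D conversion, restore the original shape.
    leading_shape = x_rgb.shape[:-1]           # breaks once these sizes are unknown/None
    flat = xp.reshape(x_rgb, (-1, 3))
    out = rgb_to_hsl_2d(flat)                  # hypothetical (n, 3) -> (n, 3) helper
    return xp.reshape(out, (*leading_shape, 3))
```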
If you have more real-world code example to point at, that would help me I think.
On a more philosophical level - we don't have APIs that produce arrays with unknown dimensionality, so do we actually need to be able to deal with that? I may be missing something here, but it seems to me like we should also want/need APIs that produce such arrays in order to need a function to parse those shapes?
> To figure out n_elements you introduce shape_components

There are two parts: `n_elements` (to combine these axes) and `axes` (to reconstruct original shape, that's also important).
> is this a typo, should it also be enforce_shape?

Thanks, this was a typo (I changed the name while writing the proposal), just corrected it; now it is `enforce_shape`.
> Isn't n_elements also calculable as xp.prod(xp.shape(x_rgb)[:-1])

That's right, it should be calculable like that, but for unknown shapes (symbols) `xp.shape` may contain `None`s.

And if the dimensionality (the number of components in the tuple) is unknown, it may return `None` altogether, as the standard says nothing about it (and some frameworks do just return None in their current interface). This prevents reshape-containing manipulations on such tensors.
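As a small illustration of the problem (not from the original thread): if `xp.shape` reports unknown sizes as `None`, the product simply cannot be computed.

```python
import math

shape = (None, None, 3)    # what xp.shape(x_rgb) may report for a symbolic tensor
try:
    n_elements = math.prod(shape[:-1])
except TypeError:
    # 1 * None is not defined, so the element count cannot be computed this way
    n_elements = None
```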
> happens to be easy here because of [..., None] - but if that expression gets more complex, it's harder to express without enforce_shape

For the cases I want to address here, you only need to compute `prod(x.shape[start:end])`, and that's enough (indices `start` and `end` may be positive or negative). Again, the problem is just that this can't be computed with the current standard.
Additionally, there is currently no way to enforce a minimal or exact dimensionality. E.g. if the first axis is batch and the last axis is channel, but between them there may be >= 0 axes, there is no way to express this requirement. The expression `x.shape[1:-1]` does not fail for 1d tensors. Proposed `enforce_shape` incorporates this (get necessary components, enforce minimal or exact dimensionality, bake check into graph).
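A quick illustration of the `x.shape[1:-1]` pitfall mentioned above (assuming NumPy-style eager behaviour):

```python
import numpy as np

x = np.ones((7,))        # 1d input where a (batch, ..., channel) layout was expected
middle = x.shape[1:-1]   # == (), silently empty instead of raising an error
```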
> On a more philosophical level - we don't have APIs that produce arrays with unknown dimensionality, so do we actually need to be able to deal with that?
You mean - do you need to deal with arbitrary shapes? I am thinking from the einops perspective (probably the first adoption would come from lib devs). Einops gets tensors from users, and users define tensors in whatever framework and can pass tensors of any dimensionality, symbols too, so that's an important piece for me.
> If you have more real-world code example to point at, that would help me I think.
Some common cases:

- relying on `x.shape` directly will not work for graph frameworks;
- `assert x.ndim == 2` is meaningless for symbolic tracing.

Now, to be fair, there are a number of cases that this proposal won't address. For example, if you have an n-dimensional image and want to do subsampling / upsampling over one specific axis, this is doable with tools currently in the standard. But if one wants to do this for every axis, unfortunately I see no simple universal tool for it right now.
In other words, when axes of variable dimensionality should be manipulated, it's hard to put into symbolic tracing without specific functionality. A bit hard to do this without some kind of generic `ifelse` operation or dimensionality-dependent dispatching.
> and axes (to reconstruct original shape, that's also important).

Agreed, that matters too.
> Proposed enforce_shape incorporates this (get necessary components, enforce minimal or exact dimensionality, bake check into graph).

For minimal dimensionality, I've never been too happy with `np.atleast_2d` & co. But probably that has to do with it also using `np.asarray` to coerce to ndarray. If `enforce_shape` would be shorthand for:

```python
if x.ndim < 2:  # example for atleast_2d
    x = reshape_to_2d_by_prepending_axes(x)
```

then that does seem useful.
> You mean - do you need to deal with arbitrary shapes? I am thinking from the einops perspective (probably the first adoption would come from lib devs). Einops gets tensors from users, and users define tensors in whatever framework and can pass tensors of any dimensionality, symbols too, so that's an important piece for me.

Dealing with arbitrary shapes - kind of, but perhaps not exactly. You must as a library author indeed anticipate getting array inputs which can have any dimensionality. I think the difference between "arbitrary shape" and "unknown shape" here is whether `input_array.ndim` is an integer or unknown.
> Einsum with ellipsis can be implemented with this functionality for any framework.

@leofang is planning to propose a `linalg.einsum` function for the standard soon, so that should have synergy here.
> In other words, when axes of variable dimensionality should be manipulated, it's hard to put into symbolic tracing without specific functionality. A bit hard to do this without some kind of generic ifelse operation or dimensionality-dependent dispatching.

Thanks for the examples @arogozhnikov. My intuition here is that your proposed function has two goals here:

- Make it easier to deal with arbitrary-but-known-dimensionality inputs to array-consuming libraries,
- Expand support for delayed/graph-based execution from "unknown axis size is allowed" to "unknown dimensionality is allowed".

(2) is probably a superset of (1). I have to think more about the implications here, and whether the API surface added for (2) also makes sense in eager libraries, or if that's then always a "do-nothing function call".
It'd be great to get some input from maintainers of MXNet, Dask, and TensorFlow here - I think those are the main beneficiaries of (2).
> If enforce_shape would be shorthand for: [reshape_to_2d_by_prepending_axes] then that does seem useful.

Hmm, I didn't think about it, that's a good point. I can imagine adding one more constant like `xp.OptionalAxis`:

```python
x, [d1, d2] = xp.enforce_shape(x, [xp.OptionalAxis, None])
```

with a restriction that all OptionalAxis should be on the left.
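(One possible behaviour, sketched as an assumption since it is not spelled out in the thread: for a lower-dimensional input the optional leading axis would be reported as size 1, in the spirit of `atleast_2d`.)

```python
# Hypothetical behaviour, not specified above:
x1 = xp.ones((5,))
x1, [d1, d2] = xp.enforce_shape(x1, [xp.OptionalAxis, None])
# plausibly: x1.shape == (1, 5), d1 == 1, d2 == 5
```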
> My intuition here is that your proposed function has two goals here:
> - Make it easier to deal with arbitrary-but-known-dimensionality inputs to array-consuming libraries,

yes

> - Expand support for delayed/graph-based execution from "unknown axis size is allowed" to "unknown dimensionality is allowed",

That's correct.
Additionally: known dimensionality with unknown components still precludes a number of operations. E.g. the einops operation `i j k -> i (j k)` requires knowing the dimensions or having symbols; the `None`s returned by `xp.shape` are not sufficient.
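As an illustration of why (assuming unknown sizes come back as None):

```python
i, j, k = xp.shape(x)            # may be (None, None, None) for a symbolic tensor
x2 = xp.reshape(x, (i, j * k))   # j * k cannot be computed from Nones, and a
                                 # single -1 cannot cover every such pattern
```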
So the goal is to have a function that can address points above and deal with all scenarios (known shape, unknown components, unknown number of dimensions).
> It'd be great to get some input from maintainers of MXNet, Dask, and TensorFlow here - I think those are the main beneficiaries of (2).
Completely agree. Symbolic tracing libs are the ones to benefit, but also the ones that need to take additional steps, so their input is critical.
Previous discussion: https://github.com/data-apis/array-api/issues/97
The resolution there was to ignore the case of data-dependent (unknown) dimensionalities.
I have a suggestion for how to support this by adding a new function with a specific interface to the API.
This allows supporting variable dimensionality in a large number of practical cases. Additionally, it allows partially controlling dimension sizes that are currently unknown.
Example 1:
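(The original code block is not visible as rendered here; a plausible sketch consistent with the description below, with `xp` as the array namespace, would be:)

```python
# Plausible reconstruction, not the author's verbatim example:
x, [d1, d2, d3] = xp.enforce_shape(x, [None, None, None])
```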
This enforces the dimensionality to be 3 and will fail for inputs of other dimensionalities. If the dimensionality is not yet known, it will fail at runtime at this operation. Elements `d1`, `d2` and `d3` are integers or symbols, not Nones.

The returned `x` is either the same tensor, or a tensor with attached information about its shape. The main reason for returning a tensor is that the operation could not be cut out of the graph if the resulting shape is not used to obtain the result.

Example 2:
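(Again a plausible sketch rather than the original block; `d1_input` is a size or symbol obtained elsewhere:)

```python
# Plausible reconstruction, not the author's verbatim example:
x, [d1, d2, d3] = xp.enforce_shape(x, [d1_input, None, 4])
```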
This expects the input to have 3 dimensions, with the first axis matching the (possibly symbolic) `d1_input` and the last axis of length 4.
Example 3:
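(A plausible sketch rather than the original block; the ellipsis stands for any number of middle axes:)

```python
# Plausible reconstruction, not the author's verbatim example:
x, [d1, d2, (n_elements, axes), d_last] = xp.enforce_shape(x, [1, None, ..., 3])
```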
This expects the input to have 3 or more dimensions. The first axis is of length 1, the second of any length, the last of length 3.

Note that for the ellipsis a tuple with two objects is returned: `axes` corresponds to `x.shape[2:-1]`, and `n_elements` is the product of the elements in `axes`. `axes` is either a tuple of `Union[int, Symbol]` (if the dimensionality is known) or a Symbol for a tuple of unknown size (if the dimensionality is unknown), the same type as used to represent an unknown shape. `n_elements` is either an int (if the size of all axes is known) or a symbol for this axis.

Signature
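(The signature itself is not visible as rendered here; a rough sketch of what it might look like, inferred from the examples above:)

```python
def enforce_shape(x, shape_components):
    """Rough sketch of a possible signature, not the author's verbatim proposal.

    shape_components: a list whose entries are None (any size), an int or symbol
        (exact size), or ... (any number of axes).
    Returns (x, components), where components mirrors shape_components with sizes
        resolved to ints or symbols, and the ellipsis entry replaced by a tuple
        (n_elements, axes).
    """
    ...
```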
I see this as a good intermediate point: sufficient to deal with multiple practical cases while requiring framework developers to maintain very little.
Downstream package dev perspective

Currently downstream packages have to rely on framework-specific calls like tf.assert(...) and bind them in the graph, which can't be done in a cross-framework manner.

Frameworks perspective
While the function is novel, it does not introduce any new entities: symbols for axes and symbols for shapes (like TensorShape) should exist anyway. The function's behavior overlaps with and extends `tf.set_shape` by supporting ellipsis.

For frameworks without data-dependent shapes, this is straightforward to implement based on the already exposed shape property.