Open mdhaber opened 3 months ago
Has this come up in real-world code? I can definitely see wanting to write functions that work on both real and complex dtypes. But I also am curious how often a complex implementation can be directly reused for a real implementation.
Yes. I have a complex number that is the logarithm of a negative real number. I take the real component to get the magnitude.
Wouldn't real(log(x))
return nan
if x
is a negative real?
It would if x
had a real dtype. Let's expand the definition of "real" to include complex dtype with zero imaginary component. Another example is the norm of a real or complex vector implemented in terms of the dot product of the vector and its conjugate.
There must be a misunderstanding here. real
is already supposed to work on any array with a complex64 or complex128 dtype, regardless of what the value of the imaginary component is. The above discussion was specifically about whether it should work on float32 and float64 arrays.
Another example is the norm of a real or complex vector implemented in terms of the dot product of the vector and its conjugate.
This is a better example I think. x @ conj(y)
would just work and give the right thing for real x
and y
if conj(y)
was a no-op.
There must be a misunderstanding here.
It seems so. But all that's important is that in my application, there are real and complex dtypes going through the same code, and real
is needed for the complex dtypes, so it would be convenient for real
to work on both real and complex dtypes. So it is relevant to this discussion.
This is a better example I think.
Sure, it's simpler. I mentioned the original example because it was the most recent time I've wanted real
to work for both real and complex types, and it's more interesting, suggesting that there may be other more exotic - but also valid - use cases out there.
It definitely make sense to me for conj
. I guess you could argue real
and imag
should probably just work the same.
Your example real(log(x))
doesn't make sense to me, because you would have to either do real(log(astype(x, complex128)))
or else branch like
if isdtype(x.dtype, 'real floating'):
return log(abs(x))
else:
return real(log(x))
but maybe there are better examples for real
and imag
. If you're using any function that maps real numbers to non-real numbers (like log
, sqrt
, etc.), then you have to do something like this, because on float
s they will just return nan
, but if you aren't, then real numbers should just pass through fine.
For real
, imag
, and conj
, I am not sure the real benefit apart from saving some character strokes. If your code needs to support real and complex dtypes, one can explicitly cast by doing
re = xp.real(xp.astype(x, dtype='complex128', copy=False))
where copy=False
ensures that if the dtype matches, the operation is a no-op. Graph-based libraries would also be able to determine whether real(astype(<real>))
is necessary. Only eagerly executed libraries such as NumPy may perform an extra copy by casting real to complex.
It would be useful to see an explicit use case beyond a toy example of where sharing the same code path for real and complex is necessary and performance critical.
Your example real(log(x)) doesn't make sense to me
I haven't tried to explain that example in complete detail. If everyone is interested, we can review the code, but I thought the basic outline would suffice.
For
real
,imag
, andconj
, I am not sure the real benefit apart from saving some character strokes
Yeah, I think that's the only benefit I'm suggesting. I don't have any examples where the extra decision (real vs complex) is performance critical. It is only a line of ternary and a few microseconds. It just seemed that the motivation for disallowing real
with real input was the idea that unintended uses were more common than legitimate uses. I didn't see data about unintended or legitimate uses - it seemed to be just notional - so I thought I'd add some data about legitimate uses.
If the decision here is to leave real
/imag
/conj
unchanged, that's fine; I've already worked around it. Just trying to help improve things based on my experience converting NumPy code to the array API. I'm doing a lot of it, and there are a lot of little things I notice. I'm only reporting the things I think are most important.
But then I'm curious - why not raise an error when one tries to round
or perform a similar no-op on an integer dtype? That is, what is the difference that makes it appropriate to raise an error for real
/image
/conj
with real argument but inappropriate to raise an error for round
/floor
/isinf
/etc. with integer argument?
I think the conjugating dot product is a legitimate enough use-case. You can have an expression that's exactly the same for complex and real inputs except for the conjugation, which is actually a no-op on real numbers. So you'd have to have something like
if isdtype(x.dtype, 'complex floating'):
<expression with conj>
else:
<same expression without conj>
Or do some trick like conj = positive
. But if conj(x)
just returned x
for non-complex dtypes you could just use the single expression without the branching.
Also note that casting to complex is the wrong solution here. That does have a real performance impact. Not only do complex arrays take up more memory, but a complex algorithm on an array with 0 imaginary part will be much slower than a real algorithm on an equivalent float array. For example:
In [1]: import numpy as np
In [2]: a = np.linspace(0, 10, 1000)
In [3]: b = a + 0j
In [4]: %timeit np.sqrt(a)
450 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [5]: %timeit np.sqrt(b)
3.88 µs ± 8.78 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
This is only the correct thing to do in cases where you don't get the same expression for real inputs, like the log
example (and even for that example, branching and doing log(abs(x))
instead is probably better).
Thanks @asmeurer.
I think another thing worth mentioning is that a lot of array API standard code is translation from existing code. For instance, I brought this up when translating a file with NumPy code that has 11 uses of np.real
or <array>.real
. In at least some of those cases, the dtype can be real or complex, and apparently the code works. So when I am making hundreds of other adjustments for the array API standard, I don't want to:
.real
individally to determine which ones can potentially have real input, using ternary logic only where necessary (possibly getting it wrong and introducing bugs), or.real
(unwieldy). For simplicity, I wrapped real
so that I have the ternary logic but don't have to write it out a dozen times.
But all the little changes can add up to a lot of developer time, so I thought revisiting this might save others some effort in the future. It could indeed be the other way - that by being stricter than the existing array libraries, this prevents bugs - but if we don't have data on that, and we have some evidence of legitimate use cases, it might be OK to follow the precedent set by the libraries.
Another thing to consider is that astype() is explicitly disallowed from casting complex inputs to real inputs (see note 2 at https://data-apis.org/array-api/latest/API_specification/generated/array_api.astype.html#astype). So real()
should be considered the canonical way to get the real part of a complex array. One could then argue that, like astype
, it should function as a no-op for arrays that are already real.
There is also a question of imag(). imag on a real array would just be an array of zeros, which would make it not completely cheap (I suppose it could be done as a broadcasted scalar, but that would lead to unexpected behavior under mutation). On the other hand, I think it makes more sense to keep imag consistent with real and conj.
By the way, just so we're clear, are you also suggesting that these functions should operate on integer arrays, or only on real floating-point arrays?
I have also encountered the usability issue with conj()
a few times already (also x @ conj(y.T
indeed) - being able to write generic code across real and complex dtypes is nicer indeed, we should make this change in my opinion.
For real()
I also agree with the arguments given - it's easier to write generic code, and it saves developer time. It's possible that the user wrote code that isn't supposed to work with real inputs, but I haven't seen evidence of this and I don't consider it a risk high enough that it should be prevented by design.
imag()
is more tricky. The usages I looked at so far at were either handling only complex code, or were doing not-so-useful things for real inputs (e.g., explicitly comparing x.imag
with an array of zeros in tests), or were better rewritten with an explicit if isdtype(x.dtype, 'complex floating'): ... else: ...
.
There is also a question of imag(). imag on a real array would just be an array of zeros, which would make it not completely cheap (I suppose it could be done as a broadcasted scalar, but that would lead to unexpected behavior under mutation).
This is the problem indeed. It's bad API for a function like np.imag
to return a view on existing data for complex inputs, and create a new array for real dtypes. It's both a non-obvious performance footgun and a potential issue with the semantics. When you're dealing with large arrays, having something go from O(1) performance to allocating a lot of memory is not good. In the majority of cases, you want to skip the computation completely for real input, or do the relevant computations with a size-1 array or Python scalar instead.
In the (relatively rare I think) cases where this is the desired semantics:
if isdtype(x.dtype, 'complex floating'):
imag_component = x.imag
else:
imag_component = xp.zeros_like(x)
it seems better to let the user write that out explicitly.
For completeness, performance:
>>> x = np.random.default_rng().standard_normal(10_000_000)
>>> x.dtype
dtype('float64')
>>> xc = x.astype(np.complex128)
>>> %timeit xc.imag
50.3 ns ± 0.154 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
>>> %timeit x.imag
4.43 ms ± 74 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
and create a new array for real dtypes
Would it have to, or could it be implemented as a view of a single zero element? It should not be writeable anyway, since the original array would still be of real dtypes, so the imaginary part could not be anything other than zero.
Would it have to, or could it be implemented as a view of a single zero element?
In principle it could be aliased memory from broadcasting a size-1 array indeed. However, (a) NumPy isn't going to make that change I think, and (b) that still has mismatching semantics where imag(x_real)
would not be writeable or have incorrect semantics under mutation for libraries that don't have read-only arrays, while imag(x_complex)
is writeable. Switching semantics based on the dtype of the input array is also not good.
It should not be writeable anyway, since the original array would still be of real dtypes, so the imaginary part could not be anything other than zero.
Currently the returned array is writeable, so there'd be a backwards compat concern for NumPy et al. too.
Currently the returned array is writeable, so there'd be a backwards compat concern for NumPy et al. too.
Well, in CuPy and Dask array, it looks like. Not NumPy, Torch (doesn't allow imag
on real arrays), Tensorflow, or JAX.
import numpy as xp # 1.26 or 2.0
x = xp.asarray([1., 2., 3.])
y = xp.imag(x)
y[1] = 10
# ValueError: assignment destination is read-only
NumPy isn't going to make that change I think
The standard doesn't specify whether libraries should return a view or copy - does it ever specify writability? If not, NumPy wouldn't have to change to be compatible with the standard. It just might not be the preferred implementation of the standard. (If you want to keep the two consistent, perhaps neither would be writable. But personally, I don't think it would be so surprising for the imaginary part of a complex array to be writable while the imaginary part of a real array is not. After all, a complex value cannot be assigned to an element of a real array.)
That said, I haven't yet identified a need for imag
of a real array, so I would not let concerns over that hold up real
of a real array, personally.
The latest version of torch does allow real
and conj
but not imag
on real arrays:
>>> torch.real(torch.tensor([1.]))
tensor([1.])
>>> torch.conj(torch.tensor([1.]))
tensor([1.])
>>> torch.imag(torch.tensor([1.]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: imag is not implemented for tensors with non-complex dtypes.
We agreed at the meeting today that it makes sense to allow real dtype inputs to real
and conj
, but to leave imag
as only specified for complex inputs.
FWIW generally agree and am sympathetic to the arguments Matt makes here. Also agree with Ralf that real
and conj
make the most sense
and create a new array for real dtypes
Would it have to, or could it be implemented as a view of a single zero element? It should not be writeable anyway, since the original array would still be of real dtypes, so the imaginary part could not be anything other than zero.
While this is possible, have seen NumPy's masked array also take this approach (False
scalar as shorthand for an array full of False
as the mask). As a past user of it, have found some of the resulting code around this to be surprising. When does it choose to expand the mask? What happens when working with views/slices of the data (can one affect masking there and have it propagate back)? Not always consistent and clear to the user. So would caution this approach
gh-427 summarizes the rationale for requiring that inputs to
real
,imag
, andconj
be complex-valued:https://github.com/data-apis/array-api/pull/446#discussion_r991577991 states that this is open to being revisited.
If the concern is ambiguity between intended and unintended use, was there a similar discussion for functions that are no-ops for integers, such as
round
andisinf
? It looks like the initial commit that added rounding functions already included the special case for integers, but I couldn't find a discussion about it.Perhaps it didn't come up because it is common for code paths that work on floating point dtypes to also be used for integer dtypes. In the same way, I'd suggest that code for complex dtypes can also be valid for real dtypes. If it is not easy to tell whether unintended or intended uses of these functions on real data is more common in the wild, perhaps the precedent set by NumPy and other libraries (e.g. CuPy, JAX, PyTorch, Dask Array, Tensorflow) is not wrong, and these functions can allow real input.