Open alanhdu opened 1 year ago
It seems Python's bytes() constructor checks whether the object is an integer (via obj.__index__()) before trying to convert it as a bytes-like object.
So, there is a bit of Python involvement there due to bytes() being heavily overloaded.
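A minimal sketch of the integer path, using a hypothetical `IntLike` class (an illustration only, not part of NumPy) that defines nothing but `__index__`. Under that assumption, bytes() treats the object as a length and returns that many zero bytes:

```python
class IntLike:
    """Hypothetical integer-like object; it defines only __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


# bytes() finds __index__ and interprets the object as a length,
# so the result is zero-filled rather than the object's contents.
zeros = bytes(IntLike(3))
empty = bytes(IntLike(0))

print(zeros)  # b'\x00\x00\x00'
print(empty)  # b''
```

This is the same path a 0-d integer array takes, which is why a value of 0 yields an empty bytestring.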
However, bytes() first checks for the __bytes__() dunder method. So, NumPy could implement __bytes__() to ensure the .tobytes() meaning here.
The main weird thing may be what to do about np.int64(0) (the scalar) if we do that.
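To illustrate the precedence, here is a small sketch with a hypothetical `Payload` class (not NumPy code) that defines both `__index__` and `__bytes__`. The 8-byte little-endian encoding is an assumption standing in for what .tobytes() would return for an int64 value:

```python
class Payload:
    """Hypothetical object that is both integer-like and bytes-convertible."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value

    def __bytes__(self):
        # Stand-in for tobytes(): 8-byte little-endian signed encoding.
        return self.value.to_bytes(8, "little", signed=True)


# __bytes__ is looked up before the __index__ check, so it wins.
result = bytes(Payload(1))
print(result)  # b'\x01\x00\x00\x00\x00\x00\x00\x00'
```

Without the `__bytes__` method, the same call would return a single zero byte, because the object would be read as the length 1.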
Just pointing out that bytearray(arr) produces the same output as bytes(arr). But I don't believe bytearray relies on __bytes__ like bytes does (see). If that's right, changing __bytes__ would create some inconsistency between the two.
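The potential inconsistency can be sketched in pure Python. The `Both` class below is hypothetical and its `__bytes__` return value is an arbitrary marker chosen for illustration:

```python
class Both:
    """Hypothetical object defining both __index__ and __bytes__."""

    def __index__(self):
        return 2

    def __bytes__(self):
        return b"from __bytes__"


obj = Both()

# bytes() consults __bytes__ first.
as_bytes = bytes(obj)

# bytearray() does not look up __bytes__; it falls back to __index__
# and returns a zero-filled buffer of that length.
as_bytearray = bytearray(obj)

print(as_bytes)      # b'from __bytes__'
print(as_bytearray)  # bytearray(b'\x00\x00')
```

So an object that only gains a `__bytes__` method would behave differently under the two constructors.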
I would say that this is expected behaviour, not a bug.
Since both NumPy scalars and NumPy 0-d arrays support memoryview, the correct invariants are written as
>>> arr = np.array(0)
>>> scal = np.int64(1)
>>> assert bytes(memoryview(arr)) == arr.tobytes()
>>> assert bytes(memoryview(scal)) == scal.tobytes()
By contrast, Python integers do not support memoryview:
>>> memoryview(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: memoryview: a bytes-like object is required, not 'int'
>>> memoryview(arr.item())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: memoryview: a bytes-like object is required, not 'int'
>>> memoryview(scal.item())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: memoryview: a bytes-like object is required, not 'int'
Given the current semantics of bytes and bytearray, it would be terribly wrong to try to fix this from the numpy side.
Maybe one could argue with the Python devs that if an object exposes the buffer protocol, this should take precedence over the “number” meaning... but currently it works the opposite way.
Describe the issue:
Perhaps this is a false assumption on my part, but I assumed that, given a NumPy array, arr.tobytes() and bytes(arr) would always return the same thing. This does not seem to be true for 0-dimensional arrays: in that case, bytes(arr) returns an empty bytestring, while arr.tobytes() returns the element cast to bytes. I personally find the latter behavior more intuitive, but I think they should probably be consistent.
Reproduce the code example:
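A minimal sketch of the discrepancy described above. The value 0 and the explicit int64 dtype are assumptions chosen so the byte lengths are predictable across platforms:

```python
import numpy as np

# 0-d array; dtype is fixed so tobytes() is always 8 bytes long.
arr = np.array(0, dtype=np.int64)

via_bytes = bytes(arr)       # integer path: zero-filled bytes of length 0
via_tobytes = arr.tobytes()  # raw buffer contents of the single element

print(via_bytes)    # b''
print(via_tobytes)  # b'\x00\x00\x00\x00\x00\x00\x00\x00'

# Going through memoryview skips the integer path and matches tobytes().
via_memoryview = bytes(memoryview(arr))

# For a 1-d array, bytes() and tobytes() agree.
arr1d = np.array([1, 2], dtype=np.int64)
same_for_1d = bytes(arr1d) == arr1d.tobytes()
```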