Open jakelishman opened 1 week ago
A minor follow-up: I think a method on PyReadwriteArray
that consumes the mutable view object would be sufficient to satisfy all the safety concerns? So
impl<T, D> PyReadwriteArray<T, D> {
pub fn make_nonwriteable(self) {
// SAFETY: the array pointer is known to be non-null. We are consuming the only
// possible extant mutable view on the data, so cannot invalidate any accessible
// mutable view.
unsafe {
(*self.as_array_ptr()).flags &= !npyffi::flags::NPY_ARRAY_WRITEABLE;
}
}
}
might work? Happy to PR and add a bit of documentation around it, if so.
We have a .readonly()
method on PyArray
, does that fit your need?
I'm trying to return the ndarray
as view object to Python space, and PyReadonlyArray
doesn't (can't, I think?) implement ToPyObject
since it can't be certain the lifetime would be bound correctly. It's Python space I want to forbid from modifying the resulting ndarray
- in Numpy land that means ensuring that two flags
are unset: the OWNSDATA
flag, which borrow_from_array_bound
handles, and WRITEABLE
, which I couldn't spot a way to do in the rust-numpy
API without making this pointer dereference.
I could copy the data into a Numpy array on access, or write a custom type that implements much of the array interface I want to provide users easy access to (it's all the magic operator methods in particular, since the ufunc protocols are easy enough), I'm just trying to reduce runtime / boilerplate code.
It looks to me like PyReadonlyArray
implements Deref
to Bound<'py, PyArray>
, which does implement ToPyObject
, so that seems like a straight-up missing API?
You can presumably also do something like (*readonly_array).to_object()
to avoid the need to wait for a release.
Sorry if I'm missing some of your point here - I'm not trying to enforce immutability within Rust, I'm trying to enforce it from Python. I can't allow Python space to write into the data buffer.
I can pass underlying PyArray
that the PyReadonlyArray
is a borrow of back to Python space, but unless I unset the WRITEABLE
flag in the struct (in Numpy C API, that's PyArray_CLEARFLAGS(my_array, NPY_ARRAY_WRITEABLE)
), Numpy will allow writes to the array from Python. Derefing PyReadonlyArray
back to PyArray
doesn't set the flag (nor should it - that'd prevent any further PyReadwriteArray
s from being taken out). It's possible that PyReadonlyArray
could be given a ToPyObject
impl, but that could be misleading; returning a Rust-space immutable borrow to Python probably oughtn't to have the side-effect of flipping the underlying Python object to immutability. If a user ever called PyO3 functions passing a PyReadonlyArray
as one of their arguments, they might suddenly get runtime borrow errors when attempting to use the same array mutably later.
Give or take, in pure Python, I'm doing something like this:
import numpy
class A:
def __init__(self):
# This is my data. In Rust, it's some other slice type.
self._buffer = b"\x00" * 256
def view(self):
# This wraps the data in an `ndarray` that forbids mutations.
return numpy.frombuffer(self._buffer, dtype=numpy.uint8)
a = A()
assert not a.view().flags.writeable
# This will raise an ValueError because it's read-only.
a.view()[0] = 1
So the underlying data buffer isn't an ndarray
, but I want to use ndarray
to provide Numpy semantics on it, but only permit reading, not writing. Numpy's C API does have all the tools to do this - the Rust code I wrote up top is (if I got it right) effectively the correct way to do it, though the C-API has a helper macro. rust-numpy
is already somewhat aware of the flag, because you can't take out a PyReadwriteArray
on an ndarray
with it set. I'm looking for a nice safe abstraction to set that flag on a PyArray
.
Ah sorry, this is entirely my mistake causing the confusion. We used to set the flag, but after the borrow strategy was changed in #258 we stopped doing that.
In which case I think the API you propose above in make_nonwriteable
seems like the right approach. I guess it should return a PyReadonlyArray
?
Do we think there are any other alternatives for the name? I briefly thought about into_readonly
but it seems to me like because this does a bit more than just the Rust borrow checking it makes sense to give it a more unusual name.
Oh yeah, returning a PyReadonlyArray
is certainly fine too. I wasn't thinking about it because I stop using the array from Rust space as soon as I set this flag.
For naming, I was somewhat avoiding "readonly" because of the clash with PyReadonlyArray
(which is a different concept of readonly), and the Rust-space as_
, to_
and into_
conventions because this a side-effect-y function; it modifies the properties of what you can do with the PyArray
that the PyReadwriteArray
is borrowed from, so its effects extend beyond the lifetime of the object it consumes.
I couldn't think if there's any other conventions in rust-numpy or PyO3 for that kind of "modification beyond the borrowed lifetime" implication. I can write up a PR, and we can play with the name in that, if anything comes to mind?
Sounds good to me, thanks 👍
I'm using
PyArray::from_borrowed_array_bound
to expose an immutable data buffer from a Rust struct to Python space via Numpy. The type needs to uphold some invariants about its internal data, which means I want to prevent Numpy from allowing writes, so I'm trying to unset theWRITEABLE
flag before returning the object from Rust space.Currently, I'm doing something like this:
I'm not certain that I'm entirely correct in doing this in the first place. If I am, then my second
unsafe
block feels like it could have a safe abstraction wrapping it? If I understand correctly, if we can assert/verify that there are no existingPyReadwriteArray
s taken out on the underlying memory in Rust space (which I haven't done explicitly here, since I only just created the array object), then removing theWRITEABLE
flag should be a safe operation.Please correct me if I've made soundness errors here. If I haven't, could we potentially add methods like
nonwriteable
/try_nonwriteable
that unset this flag in a safe manner?