Closed jungerm2 closed 5 months ago
Then this assignment silently fails because obj.mask return a copy of the data. As I understand it, it's not possible to return a mutable array as both to_pyarray and into_pyarray consume/clone the data. Please correct me if this is wrong. Thus, the next best thing is to return a readonly array so that it'll throw an error if the user does the above.
If you want to mutably share arrays with Python code, I think you would need to store Py<PyArray2<bool>>
in your #[pyclass]
so that #[get]
can just return a Py<PyArray2<bool>>
by cloning, i.e. by incrementing the reference count. Of course, this implies that when you want to have Rust code modify this array, you need derive an ArrayView
into the NumPy array.
This works fine, but seems like it should be a part of the library? Like if I write to_pyarray().readonly() the resulting array in python should be readonly too?
I think this is a slightly unhappy naming clash, PyReadonlyArray
and PyReadwriteArray
are short-lived wrapper types which implement our Rust-side borrow checking to avoid Rust code creating unsound references into the interior of NumPy arrays, c.f. https://docs.rs/numpy/latest/numpy/borrow/index.html. They are not intended to modify the underlying NumPy array, but rather to avoid the unsafe
block you use in your setter by providing a safe path towards as_array
.
Thanks for the quick reply, and for clearing up this confusion.
I'll consider embedding a Py<PyArray2<bool>>
in my class instead, but I'm not yet sure what consequences that will imply. I intend to be able to instantiate this struct from within rust, without any python dependencies, as part of a package. So maybe this is a non starter for me...
As far as the unsafe
code, is there a way to avoid that here? I just noticed there's a to_owned_array
method, but it's doing exactly like above. I'm not sure I understand why PyArray2
-> ArrayBase
would need to ever be unsafe, they both hold size info, so presumably there's bound checks and all?
I intend to be able to instantiate this struct from within rust, without any python dependencies, as part of a package. So maybe this is a non starter for me...
If it is a #[pyclass]
unconditionally, you will always depend on at least libpython3.so
. Otherwise, i.e. if the type can be used without the #[pyclass]
annotation, this would tie you to depending on Python with a working NumPy installation. You might need to have a separate layer for deep integration with Python with is built from the pure Rust data structure.
As far as the unsafe code, is there a way to avoid that here? I just noticed there's a to_owned_array method, but it's doing exactly like above. I'm not sure I understand why PyArray2 -> ArrayBase would need to ever be unsafe, they both hold size info, so presumably there's bound checks and all?
There is more to memory safety than bounds checking. Most importantly, since ArrayView
basically hands out shared and mutable references to the array elements, the usual Rust rules apply, i.e. many shared references but not mutable one or a single mutable reference. We have dynamic borrow checking which ensures that e.g. you only have one live ArrayViewMut
in your extension for every NumPy array.
In this particular case, to_owned_array
might do the same, but it can be safe because the short-lived references it used never escape the function so that there can be no aliasing issues, i.e. I would suggest to use it instead of the incantation.
It looks like I'll need to give my project layout some more thought...
I'll go ahead and close this issue as the original readonly question was answered. Thanks again!
I have a pyclass that contains an
Option<Array2>
, and I would like to be able to get/set it from python. The macros#[pyo3(get, set)]
do not work for this so I ended up writing a getter/setter like so:Which works fine, but there's a subtle bug. If a user does something that would usually call
__setitem__
, such as:Then this assignment silently fails because
obj.mask
return a copy of the data. As I understand it, it's not possible to return a mutable array as bothto_pyarray
andinto_pyarray
consume/clone the data. Please correct me if this is wrong. Thus, the next best thing is to return a readonly array so that it'll throw an error if the user does the above.However, returning a
PyReadonlyArray
get's converted to a normal numpy array and the user can still erroneously set data. Here's my workaround, which calls numpy's setflags such that the returned array is not writable.This works fine, but seems like it should be a part of the library? Like if I write
to_pyarray().readonly()
the resulting array in python should be readonly too?