Open JP-Ellis opened 3 years ago
I'm just a user, not a dev, but here's my take.
Firstly you should understand that an ArrayView
is not just a pointer to an array. It's a pointer to some data (which may or may not be a full Array
), but possibly using a subset of that data, and possibly with a different stride length, meaning that it could be a view over say every second element of the array. This information is stored within the ArrayBase
struct.
My second general point is that, since all array subtypes are just ArrayBase
with different type signatures, you can and should make your functions generic using these and almost never require any particular array type.
Now to your questions:
Should functions prefer to take ArrayView over the equivalent Array
Neither. As mentioned above, if your function doesn't care what type of array it takes, it should be made generic over ArrayBase<S, D>
. I can't think of any functions that should require an ArrayView
, especially because I believe all array types implement view()
to convert them in a recommended way to an ArrayView
if you somehow need one (e.g. the split_at()
function is only implemented for ArrayView
for some reason).
what about using a pointer vs by value
Follow Rust best practices here, and use the lowest level of ownership you can. If your function doesn't mutate the array, just accept a borrowed ArrayBase
. In your example this would be:
fn sum<A, S, D>(a: &ArrayBase<S, D>) -> A
where
S: Data<Elem = A>,
D: Dimension;
And indeed this is basically how it's implemented in ndarray
, albeit inside an impl
block:
https://github.com/rust-ndarray/ndarray/blob/87506b8bec7cc9a265289beaa56d0b1b58ffad4f/src/numeric/impl_numeric.rs#L19-L49
When muteable data is expected, is there a preference to using ArrayViewMut vs &mut Array?
Remember, if your function takes an ArrayViewMut
, the caller has to transfer ownership of the whole View struct to your function, and can no longer use it. This means throwing away all the subset/stride information. Again, make your functions generic over &ArrayBase
. If only the contents of the array need to be mutated, make it generic over ArrayBase<S, D> where S: DataMut
. Only use a mutable reference if you need to edit the array itself, which I can't think of many use cases for. Maybe if you want to reshape it?
I would expect that for most applications, the ArrayView and ArrayViewMut are preferred as they are more general.
As explained, no, making it generic over ArrayBase
with the appropriate trait is preferred as it's more general again.
Is the fact that they are views into an array mean that they behave more like pointers and thus &ArrayView is redundant?
No. See my first paragraph.
Thanks for your detailed answer :smile: I will refactor some of my code to be a bit more general and use ArrayBase
.
Perhaps there should be some part of the documentation that addresses this to make it clearer? Or if that exists somewhere, I didn't see it.
Note that whether I pass in ArrayViewMut
into a function or pass a mutable pointer to it, Rust should prevent other parts of the code from accessing it elsewhere (even for reading). So that's why I thought it wouldn't matter too much. I agree that the mutable reference would only ever be needed if I wanted to chnage the underlying array and generally modifying data should only require a mutable view into the array (through DataMut
).
Note that whether I pass in ArrayViewMut into a function or pass a mutable pointer to it, Rust should prevent other parts of the code from accessing it elsewhere (even for reading). So that's why I thought it wouldn't matter too much.
This is true, it's not the safety that makes this unadvisable, it's the fact that you're requesting more ownership in your function than you need, and users will be annoyed that they have to re-create the view in their code since your function consumes it.
ndarray
's API in this respect. (See "Good way to express function parameters" under "Motivation".) With ndarray
's current API, I pretty much agree with @multimeric, with the caveat that monomorphization bloat is also a concern, as mentioned in #879. To avoid this issue, it's probably best to use &ArrayBase<S, D>
as the parameter type for the public function, but have the body of the public function immediately call a private, inner function which operates on ArrayView<A, D>
. That gets pretty tedious to write, though. As a result, for my own private code, I often have functions take ArrayView/Mut
parameters just to keep things simple.@jturner314 Thanks for the link to the RFC! This looks like a very promising change for the next major release of ndarray.
This is a great question and, as jturner314 had pointed out, will be addressed by resolving #879. As a result, this will be closed by #1440 when it lands.
I was wondering whether there was a best practive when writing functions that expect arrays, specifically when they are not taking ownership of the array. I couldn't see this being addressed in the documentation or in an existing issue so here's a few things I think could be worth thinking about:
Should functions prefer to take
ArrayView
over the equivalentArray
, and what about using a pointer vs by value? For example, which of the following would be 'best'?When muteable data is expected, is there a preference to using
ArrayViewMut
vs&mut Array
?I would expect that for most applications, the
ArrayView
andArrayViewMut
are preferred as they are more general.Is the fact that they are views into an array mean that they behave more like pointers and thus
&ArrayView
is redundant?