Routines to implement - Githubissues

arteymix commented 8 years ago

I think it would be nice to wrap up what routines we want to have in Array.

View-like routines

[x] head and tail which can be based on slice
[x] flip to reverse a dimension (multiply corresponding stride by -1)
[ ] reshape covered in #7
[ ] flatten covered in #11
[ ] ravel (like flatten, but contiguous, order is lost, basically shape[0] = size)
[ ] view_as to change the scalar_type and scalar_size

Other should be inspired from NumPy: https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html

arteymix commented 7 years ago

Most of these routines are not possible to realize with views. I don't really like the idea of either returning a copy or a view because it's not strictly equivalent if we can mutate the array.

Should we work these out in-place?

rainwoodman commented 7 years ago

In-place works when the array can be permuted to be trivailly iterable. But if there is stepping in the slicing it won't really work.

For a starter, what about

for most methods, if a view cannot be returned we will raise an error.
flatten() is one of those methods, dies if the array cannot be flatten.
ravel() will always return a copy. never a view.

To really close the loop. We need to look into 'indirect' arrays. where we save a pointer or an offset for each element in the array; numpy has explicitly avoided dealing with such arrays. It was defined in Python buffer protocol I believe. Maybe we want to support it. It will allow very twisted views of an array.(e.g. sorting by a space-filling curve).

arteymix commented 7 years ago

By in-place, I meant moving the array content into itself. However, that might not be a good idea because it will likely break the parent's view.

Indirection array could be really nice if we manage to keep them compact (e.g. dense array of gpointer to a dense array of values). In the end, it's just a special cast of get_pointer and set_pointer. It could either be built-in or a subclass of Array. That would allow us to cover all the possible transformation in a view-like fashion and we could use that to perform non-trivial ones.

arteymix commented 7 years ago

We could make a special case for gpointer array as well and implicitly perform indirection in get_pointer and set_pointer.

rainwoodman commented 7 years ago

I would prefer keeping them as ptrdiff_t, as well as keep a reference to the data object of the parent.

Even in that case it is a bit strange. For strides we know we shrink the support of the array object every time we play with the view. For indirect array this is no longer true. We can convert from strided array to indirect array, but the other way arround is usually impossible; then we can always stride an indirect array (just stride the index).

This is messy!

arteymix commented 7 years ago

Yeah, we'll have to think about that. If we start using indirection, it will compromise memory efficiency. It's nice though because it preserves the view-like nature without relying on a copy.

Maybe we could have some sort of copy and compact feature? Having too large strides can become ineffective and for relatively small subset, we might like to copy into compact memory:

var b = array.step ({1000}).recompact (); // b contains a copy aligned in 'scalar_size'

I'll try to implement the trivial cases for the said routines and raise error otherwise. Then we'll see how we want to handle other cases.

rainwoodman commented 7 years ago

Yes.

In numpy I usually use .copy() for this. The a new copy in numpy is always C-contiguuos by default.
Looks like I shall refactor the value_xxx functions from the PR and roll back the name changes?

arteymix commented 7 years ago

Ok, let's have view-like and in-place routines with copy to apply it in a non-destructive fashion. This way, we'll never implicitly allocate memory (outside from copy).

For subclasses of Array, we could make copy virtual so that it can be overwritten properly.

The in-place could return the current instance for chaining.

Thus,

var b = a.copy ().reshape ({2, 3}); // returns the copy of 'a'

arteymix commented 7 years ago

Yeah, sure, you could do that. We can change the routines name later if we really need to.

rainwoodman / vast

Routines to implement #14

View-like routines