Implement non-axis windows and chunks iterators

Robbepop commented 7 years ago

Already implemented features and todos:

[x] ExactChunks and ExactChunksMut aswell as their NdProducer impl
[ ] RaggedChunks or simply Chunks aswell as their NdProducer impl
[x] Windows
[x] NdProducer impl for Windows

Implement n-dimensional windows and chunks iterators.

Windows-iterators are especially useful for convolutional neural networks while chunk iterations may prove useful for paralell algorithms.

N-Dimensionality for both iterators use N-dimensionally shaped items.

For example for a 1-dimensional Array the windows and chunks items are also 1-dimensional. For a 2-dimensional Array the windows and chunks items are 2-dimensional, and so on.

For windows iterator it could be useful to support two different kinds of windows iterators. One for "valid" windows only that iterate only over windows within a valid region. For example in a valid-windows iterator over a 10x10 shaped 2d-array and window sizes of 4x4 the iterator would iterate totally over (10-4+1)x(10-4+1) items of size 4x4. A full-windows iterator would also iterate over the windows with invalid elements (on the edges of the array) and would simulate missing elements with zero elements.

Semi-nice visualization of n-dimensional chunks and windows iterators.

A similar approach could be used for n-dimensional chunks that iterate beyond the array sizes and replace missing/invalid array elements with zero elements.

bluss commented 7 years ago

That sounds good, and I love a good illustration to go together with expressing an idea.

One concern I have here is that it's probably still not an efficient way to implement convolution, to go through this.

Robbepop commented 7 years ago

That sounds good, and I love a good illustration to go together with expressing an idea.

Thanks! :)

One concern I have here is that it's probably still not an efficient way to implement convolution, to go through this.

True! For performance the most promising implementation for convolution is to use a GPGPU approach. But what if one doesn't want to go that far? The main question for this issue for me is, if it is possible to provide such iterators with equal performance as extern definitions. But as far as I can imagine such iterators would perform much better if implemented internally in this crate with hidden data of ArrayBase.

Also: What implementation would fit best? We could implement such iterator for example by using an item-sized local Array within the iterator to reuse the memory for copying the data out of the nd-array.

Or we could also use iterators over iterators over the nd-array and thus avoid copying of elements completely but I don't know if this approach is faster since I do not know ndarray internals too well. As far as I can think the strike-throughed idea is not feasable if we want ArrayView as items for the new iterators.

bluss commented 7 years ago

Between as_ptr, as_mut_ptr, and the .axes() iterator (each axis' length and stride), all the information of the raw representation is available, except for issues related to ownership, and they don't matter here. So I think it's possible to do it outside of the crate, but not so natural.

Robbepop commented 7 years ago

Between as_ptr, as_mut_ptr, and the .axes() iterator (each axis' length and stride), all the information of the raw representation is available, except for issues related to ownership, and they don't matter here. So I think it's possible to do it outside of the crate, but not so natural.

Okay, good to know about the currently possible "hack", but yeah it wouldn't be a very natural way to encode this behavious and would require unsafe rust to deref the pointers which is unnecessary.

For me both proposed iterators would be great and would be of great use for my projects!

bluss commented 7 years ago

Sure, the window iterator is a good idea. If I understand correctly, chunks has a chunk size in each dimension? .axis_chunks_iter() currently covers it if you just want to chunk it along one dimension, (using the full width of every other).

I also want to underline that using the unsafe methods or lower level code is not a hack but a feature, we need to allow it so that people can use it to improve their programs if needed.

Robbepop commented 7 years ago

For my proposed idea I cannot use the axis_chunks_iter since it has a different behaviour. The items of the chunks iterator I proposed has the same dimension as the underlying Array. It just acts like the 1-dimensional chunks iterator for slices but for n-dimensions.

For 2-dimensions I could really use this for the pooling function within the convolution.

And for a 3 dimensional array of size 10x10x10 I could for example chunk it with 5x5x5 cubes and would iterate over (10³ / 5³) chunks in total.

For both iterators it should be decided whether we also are in need of full-iterator that (as described) above iterate over the full area, instead of only the valid area. This is possible since in ndarray elements are all NDFloat which is Copy and has a default value 0.0 which we could simply use for invalid elements.

bluss commented 7 years ago

axis_chunks_iter gives views that have the same dimensionality (1d → 1d, 2d → 2d) though.

Robbepop commented 7 years ago

axis_chunks_iter gives views that have the same dimensionality (1d → 1d, 2d → 2d) though.

Okay I will have to take another look into the docs. Maybe I have overseen things and it could really be used to simulate the chunks iterator that I proposed here.

Robbepop commented 7 years ago

I have found a semi-nice visualization of the proposed chunks iterator for 2 dimensions here.

bluss commented 7 years ago

Right, the current axis_chunks_iter cuts stripes, not squares. So they are different in that way.

Robbepop commented 7 years ago

How do you feel about the importance of a chunks iterator facility for ndarray?

bluss commented 7 years ago

Not important, but we can try to do it

Robbepop commented 7 years ago

How do you feel about me trying to do it via PR? (This would probably take a while since I have to wrap my head around the libraries internals.)

bluss commented 7 years ago

Go ahead.

Remember how windows for slice has no mutable variant? That needs to be true for us too.
Maybe I should tell about the trick that we usually use for iterators: We reuse the basic element iterator (Baseiter) . How to construct a chunks iterator? Make an iterator that visits exactly the top right corner of each chunk (This can be done using slicing with a stride, which we can also emulate manually). For each element, you map it to an ArrayView that starts in that element (its pointer is the arrayview's pointer!)

Robbepop commented 7 years ago

Okay I will try! :)

Robbepop commented 7 years ago

@bluss

Today I looked into the ndarray code to wrap my head around the iterator and especially the BaseIterator and ArrayBase code.

It is not yet very clear to me but could it be possible to design a windows or chunks iterator without any copy operations into an iterator-internal owning ArrayBase object by exploiting the strides of the ArrayBase and just giving out cheap instantiations of ArrayViews that have a proper ptr and strides?

I need feedback on this.

bluss commented 7 years ago

I recognize it's not super easy to implement a new iterator, since it's so much about ndarray implementation details.

I think you would want do do something like InnerIter: have a Baseiter and also extra information. However for chunks/windows it needs to have a whole dimension/stride set, not just a single number (Basically D instead of an Ix). https://github.com/bluss/rust-ndarray/blob/master/src/iterators/mod.rs#L400

Robbepop commented 7 years ago

Yep I got that point with the D: Dimension strides for the ArrayView but I was just curious if the whole thing is really implemented in that way as I thought that we have to store an internal copy of all the floats within the current iteration window or chunk but with this architecture all that copying is not needed and thus the whole iteration should be a lot more efficient than I thought in the first place. ^.^

Let's see if I can get it working. :S

bluss commented 7 years ago

If I think of the chunks iterator and am not mistaken I think that

We use sliced dimension and sliced stride to initialize Baseiter (One can slice in place using the array method islice, but it can be done with lower level functions as well). Each yielded ArrayView will have the dimensions of a chunk (clipped to the full array's boundaries) and strides that are equal to those of the original array(!)

bluss commented 7 years ago

This code could also be read to understand how to use a slice / subview of an array as a “map” of array views.

Imagine we have an array of N dimensions and want to cut it into separate one-dimensional “sticks” (or “lanes”).

That's what these methods do. Only difference is they use a callback instead of an iterator.

https://github.com/bluss/rust-ndarray/blob/d9492d7b5812e47dfeb23d062f2acd33a62b41cd/src/impl_methods.rs#L1371-L1386

Robbepop commented 7 years ago

I think I understood how I can utilize islice to create the iterators. On iterator instantiation I create an ArrayView of the n-dimensional Array under iteration and with islice I mutate the internal ArrayView and give out references to it on a call to next.

I could also use slice method to create a new instance of the ArrayView instead of mutating an old one. The advantage here would be that I could easily move the newly created ArrayView on a call to next and would yield values instead of references on iteration which would be nice in case of non-owning ArrayViews.

The part that is currently unclear to me is, how I can perform the n-dimensional iteration of all possible Slices of the chunked or windowed Array. Is there built-initerator functionality for that to yield all possible indices of the array? - Is the Indices iterator the right utility for this approach?

We need an efficient way for iterating over all possible slices or indices. I saw that the iteration Item for Indices is a Vec<usize> which implies tons of potential malloc invokations. We should avoid that I think.

bluss commented 7 years ago

The iteration Item for Indices is D::Pattern where D is a dimension.

That means if D is Ix2, the item is (usize, usize). If D is IxDyn, dynamic number of axes, the item is indeed Vec<usize>. It's now IxDyn works, it uses a lot of dynamic allocations, but it's not the usual way you'd use ndarray. Ix1, Ix2 etc are the common ones.

bluss commented 7 years ago

This is maybe not so straightforward, you could leave this to me if you want.

Robbepop commented 7 years ago

That means if D is Ix2, the item is (usize, usize).

That's awesome!

This is maybe not so straightforward, you could leave this to me if you want.

I would like to try it out now if this is okay for you.

So I think I can design Windows like so:


struct Windows<'a, A: 'a, D: 'a + Dimension> {
    /// view of the entire array under iteration
    view   : ArrayView<'a, A, D>,
    /// iterator for all possible valid window indices
    indices: Indices<D>,
    /// the shape of the windows
    shape  : Shape<D>
}

pub fn windows<'a, A: 'a, D: 'a + Dimension>(array: ArrayView<'a, A, D>, shape: Shape<D>) -> Windows<'a, A, D> {
    Windows{
        view:    array.view(),
        indices: indices_of(&array), // restrict array to area for valid windows shapes:
                                     // e.g. array.shape() == [10, 8], window shape is [5, 4], then indices, area
                                     // should be (0..5, 0..4).
        shape:   shape
    }
}

impl<'a, A: 'a, D: 'a + Dimension> Iterator for Windows<'a, A, D> {
    type Item = ArrayView<'a, A, D>;

    fn next(&mut self) -> Option<Self::Item> {
        match self.indices.next() {
            None => None,
            Some(idx) => {
                let slice_arg = slice_at_idx_of_shape(self.shape, idx); // function that converts a shape and a given index into a SliceArg
                Some(self.view.slice(slice_arg)) // return window with given shape at current index
            }
        }
    }
}

Questions for me to solve:

How to restrict the indices iteration area to valid windows for given shape s.
How to convert a given idx in next into a shape of the given shape `s.

How close or far am I from your theory behind the codes for the windows iterator?

To create a windows-slice for slice method from a given shape Dim and an index (offset) Dim without dynamic memory allocation in the special case of 2 dimensions I came to this code:

fn slice_at_idx_of_shape_2d<D: Dimension>(shape: D, idx: D) -> [Si; 2] {
    [
        Si(idx[0] as isize, Some((idx[0] + shape[0]) as isize), 1),
        Si(idx[1] as isize, Some((idx[1] + shape[1]) as isize), 1)
    ]
}

Provided that this implementation is correct it would be best imo to create a 1D, 2D, ... 6D version for it and a generalize dynamic allocation requiring method to avoid dynamic allocation up to 5 dimensional arrays. With strides we might be able to easily implement chunks-iterators.

Robbepop commented 7 years ago

I have created a prototype implementation of a generic, malloc avoiding into_si_with_offset trait implementation which can be found here.

This makes it possible to map a pair of shape: Dim (windows/chunks size) and offset: Dim (from indices) into a slice argument for slicing.

bluss commented 7 years ago

Making SliceArg available like that is good

bluss commented 7 years ago

I have sort of gone into this territory with Zip, a chunk iterator might show up there.

Robbepop commented 7 years ago

I haven't had quite enough time to work on this the last few days. It is nice that you also worked on another path there which looks really promising! Maybe I can actually use this zip a lot in order to improve already existing Prophet code.

I have sort of gone into this territory with Zip, a chunk iterator might show up there.

Does that mean that you are able to implement a Chunks iterator using this new facility?

I came to the conclusion that a Windows iterator is a lot easier to implement than a Chunks iterator. Basically the template you just gave me needs only little changes for it to be a valid Windows iterator.

bluss commented 7 years ago

What I want now with Zip is that one can zip together chunk producers and arrays.

If an array is dim (10, 10) and we get a chunks producer with (2, 2) chunks, the chunks producer has dim (5, 5) and we can zip it with other (5, 5) arrays.

Iterators are stateful, and handling the halfway-iterated state is troublesome. So I'm going the route (like rayon) of having “Producers” (Think of that as IntoIterator for the array zip). A producer can be split etc for parallelization. Some objects like Chunks is a Producer and IntoIterator, other objects like AxisIter can implement Producer and Iterator directly (because the axis iter is one dimensional, this is not a problem).

Robbepop commented 7 years ago

Sounds very nice so far - especially the improved integration with rayon! The question that I am asking myself is: Do you still need a Chunks and Windows iterator implementation after this huge commit or is it self-contained in it?

If you no longer need it, I can at least serve you with the test cases implementation I came up with. There are many for chunks: (just for 2 dimensional arrays so far)

only even chunks (easy one)
odd chunks on the right handside (horizontal) (hard)
odd chunks on the vertical axis (hard)
odd chunks on both axes
chunks of size 1x1 -> should be same as normal iteration. So should this do some warning on use?
chunks of size 0 -> error!
chunks of size greater than the underlying array -> what is the expected behaviour for this?

Odd chunks are shaped so fit into the remaining valid region, as in the standard chunks iterator.

We could maybe also implement an EvenChunks iterator besides this one, that would simply skip odd chunks or panic on misusage (what you like better) which should give us some performance when we do not need to cover these many edge cases. What do you think?

bluss commented 7 years ago

I've only implemented the "EvenChunks" case so far (I called it WholeChunks to be less ambiguous). Basically it was there because I needed it to ensure that the trait for n-ary Zip could support it.

I think that yes we want at least the windows iterator (and windows producer!)

bluss commented 7 years ago

chunks of size greater than the underlying array -> what is the expected behaviour for this?

For whole chunks, the result must be 0 chunks (all chunks produced have the requested size), for ragged chunks it would be as many ragged chunks that fit. If the chunk is larger than the array in all dimensions, the result is 1 chunk, surely? Not sure what to do if the source array is size 0.

Robbepop commented 7 years ago

How do you like working with the Result type in such cases? Could it be worth returning a Result<Chunks> instead of Chunks iterator directly in cases where it doesn't make sense to create such an iterator, like in the case of a zero sized array? Or should we just return a ragged chunk of size zero? Or panic on zero sized arrays?

I personally thing the cleanest solution would be to simply return a zero sized ragged chunk, however, this is not the best solution to point a user to a logic error if this scenario is the result of one.

bluss commented 7 years ago

If it's a logic error and programmer's error we have a convention to just panic; for example [].chunks(0) panics and [1, 2, 3][10] panics. Result is preferred for situations when errors is part of the regular use of the API.

I'm not sure zero sized arrays are common.

Robbepop commented 7 years ago

I think I have prototyped an implementation for the Windows iterator based on the IterBase - was actually easy compared to the Chunks iterator with all its edge cases.

The implementations for the Windows and Chunks iterators will be splitted into two different PRs and the more difficult Chunks iterator will follow soon, hopefully.

I am going to write some tests for the Windows iterator now to verify that it is working as far as the tests are verifying it.

bluss commented 7 years ago

Great! I wrote some chunks tests for whole chunks (the easy version), maybe those can inspire. tests/iterator_chunks.rs

I'm wishing that we make an NdProducer out of Windows, so that we can use it in azip.

Robbepop commented 7 years ago

I have inspected your WholeChunks iterator implementation. Maybe with your code as template I can get NdProducer working with my Windows implementation. I also looked at your WholeChunks test - looks kinda complicated to me. My tests would look more like that:

#[test]
fn test_windows_iterator_simple() {
    let a = Array::from_iter(10..30).into_shape((5, 4)).unwrap();

    let mut iter = windows(a.view(), Dim((3, 2)));
    assert_eq!( iter.next().unwrap(), arr2(&[ [10, 11], [14, 15], [18, 19] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [11, 12], [15, 16], [19, 20] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [12, 13], [16, 17], [20, 21] ]) );

    assert_eq!( iter.next().unwrap(), arr2(&[ [14, 15], [18, 19], [22, 23] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [15, 16], [19, 20], [23, 24] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [16, 17], [20, 21], [24, 25] ]) );

    assert_eq!( iter.next().unwrap(), arr2(&[ [18, 19], [22, 23], [26, 27] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [19, 20], [23, 24], [27, 28] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [20, 21], [24, 25], [28, 29] ]) );

    assert_eq!( iter.next(), None );
}

I wanted to (KISS) keep it simple stupid. :S

bluss commented 7 years ago

Looks pretty good. I'm trying to learn to write more “exhaustive” style tests than just inputting examples. Your tests look good as they are, I'd probably use itertools::assert_equal to test the iterator's sequence though.

bluss commented 7 years ago

So for example I'd prefer a test that tested all possible window sizes, since that's more exhaustive.

Robbepop commented 7 years ago

Looks pretty good.

Okay thanks!

I'd probably use itertools::assert_equal to test the iterator's sequence though.

Awesome! Didn't know about its existance. :)

So for example I'd prefer a test that tested all possible window sizes, since that's more exhaustive.

True! I will first implement some naive implementations and later add some exhaustive tests. I would also like to test other dimensionalities, such as 3D.

Robbepop commented 7 years ago

#[test]
fn test_windows_iterator_simple() {
    let a = Array::from_iter(10..30).into_shape((5, 4)).unwrap();
    itertools::assert_equal(
        windows(a.view(), Dim((3, 2))),
        vec![
            arr2(&[ [10, 11], [14, 15], [18, 19] ]),
            arr2(&[ [11, 12], [15, 16], [19, 20] ]),
            arr2(&[ [12, 13], [16, 17], [20, 21] ]),

            arr2(&[ [14, 15], [18, 19], [22, 23] ]),
            arr2(&[ [15, 16], [19, 20], [23, 24] ]),
            arr2(&[ [16, 17], [20, 21], [24, 25] ]),

            arr2(&[ [18, 19], [22, 23], [26, 27] ]),
            arr2(&[ [19, 20], [23, 24], [27, 28] ]),
            arr2(&[ [20, 21], [24, 25], [28, 29] ])
        ]);
}

Robbepop commented 7 years ago

All tests are passing now but this one:

/// Test that verifites that no windows are yielded on oversized window sizes.
#[test]
fn windows_iterator_oversized() {
    let a = Array::from_iter(10..37)
        .into_shape((3, 3, 3))
        .unwrap();
    let mut iter = windows(a.view(), Dim((4, 3, 2))); // (4,3,2) doesn't fit into (3,3,3) => oversized!
    assert_eq!(iter.next(), None);
}

How is this functionality achieved in itertors using BaseIter in general? I do not want to implement it via a hack ...

bluss commented 7 years ago

Just make sure the dimension of the iterator is computed to be (0, 1, 2) which means 0 elements

Robbepop commented 7 years ago

Ah nice, I got screwed by this because of another bug with underflowed subtraction. All tests are working now.

bluss commented 7 years ago

Sorry for all the churn around NdProducer and so on. It's a big project to make it work.. and a lot needs to change around it. Let me know, I'm fine with merging a windows iterator and fixing it to be ndproducer.

Robbepop commented 7 years ago

Sorry for not notifying you on the status of the current windows implementation and tests. I haven't had that much time in the past days ... I will make a PR today if I can get it working. =)

Robbepop commented 7 years ago

PR completed #306 - awaiting feedback

Robbepop commented 7 years ago

I have added a task-list for the todos until this issue can finally be closed. We should discuss if we still need and/or want the RaggedChunks iterator.

bluss commented 7 years ago

Good. Lists are good. Checking things off is the next best thing to closing issues :wink: Thanks for working on Windows and working with ndarray in general.

Robbepop commented 7 years ago

Thanks! :)

And thank you for maintaining all these awesome, high-quality libraries that are critical for the well-being of the rust ecosystem!

rust-ndarray / ndarray

Implement non-axis windows and chunks iterators #276

Already implemented features and todos: