
The difference in allocations is mostly a red herring; if you first call collect on x you'll see the same size allocation for both. The issue is that x[200:800], when x::AbstractRange, can return a range. x[200:800] does almost no work when x is a range (type @edit x[200:800] with x = LinRange(-5,5,1000) to see what I mean).
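To make that concrete, here is a small sketch using only Base Julia (nothing from FFTViews) that contrasts indexing a range with indexing a collected vector. The variable names are purely illustrative.

```julia
# Indexing a range with a range stays lazy: the result is another range,
# described by its endpoints and length rather than by stored elements.
x = LinRange(-5, 5, 1000)
y = x[200:800]

# After collect, x is an ordinary Vector, so the same indexing expression
# has to copy 601 elements into a freshly allocated array.
xc = collect(x)
yc = xc[200:800]

println(typeof(y))    # a range type
println(typeof(yc))   # a Vector
```

Comparing `@allocated x[200:800]` with `@allocated xc[200:800]` (inside a function, to avoid global-scope noise) should show where the allocation difference comes from.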

Currently there is no specialized implementation for FFTView(::AbstractRange) that returns a range object; moreover, because of the periodic boundary conditions you might need to return a concatenation of two ranges, so that would require a novel type.

Try calling x = collect(x) and repeating your test, and you'll see (1) that the allocations are identical and (2) that the performance gap has narrowed tremendously.

The remaining difference in performance is because getindex calls reindex, and modrange is insanely expensive because integer division is one of the slowest operations on a CPU. But you need it for the periodic boundary conditions.

If you want to fix this, here are three possible strategies (and they can be applied in combination):

1. Contribute a specialized getindex method for FFTViews of ranges. You'll need to create a DoubleRange type or something. This will get hairy beyond one dimension, though.
2. Contribute a specialized getindex method for range indices. Currently, every index in 200:800 gets modded. However, for a range you only need to apply the boundary conditions once, breaking it up into at most two ranges. This goes beyond the typical AbstractArray interface, but there is nothing wrong with supplying higher-order methods as long as you are prepared to deal with the consequences. This will be far easier than the first option to generalize to multiple dimensions.
3. Modify the FFTView struct so that it stores a fast multiplicative inverse and thereby avoids integer division. We already do this for our ReshapedArrays; there's no reason not to do it here.
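As a rough illustration of the multiplicative-inverse strategy, here is a minimal sketch. It assumes the internal (non-public) Base.MultiplicativeInverses module, which may change between Julia versions. `FastMod` and `fastmod` are hypothetical names, not part of FFTViews.jl or Base.

```julia
using Base.MultiplicativeInverses: SignedMultiplicativeInverse

# Hypothetical helper type: the period n plus a precomputed multiplicative
# inverse of n, so later remainders need no hardware integer division.
struct FastMod
    n::Int
    inv::SignedMultiplicativeInverse{Int}
end
FastMod(n::Integer) = FastMod(Int(n), SignedMultiplicativeInverse{Int}(Int(n)))

# Same result as mod(i, m.n) for positive m.n.
@inline function fastmod(i::Int, m::FastMod)
    r = rem(i, m.inv)            # truncated remainder via multiply and shift
    return r < 0 ? r + m.n : r   # shift negative remainders into 0:n-1
end

m = FastMod(1000)
agrees = all(fastmod(i, m) == mod(i, 1000) for i in -3000:3000)
println(agrees)
```

In an actual implementation the inverse would presumably be computed once when the view is constructed and stored per dimension, so the setup cost is paid only once.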

I'm hesitant to even guess which of these would be easier to implement or more effective for your purposes. The first will be more effective (by far) for the problem you pose, but is probably limited to one-dimensional objects. The second will also be really effective and readily extensible to higher dimensions, but applies to range-indexing only. Applying mod to ranges is easy; the main challenge there will be not breaking Julia's dispatch rules. You'd want to add ambiguity tests and a few more indexing tests to make sure you haven't broken anything for indexing with the many different supported index types. The third will help your problem a lot, but it still won't reach the performance of typical arrays. However, it will generically speed up indexing operations with FFTViews for any supplied index types, and it will run into fewer dispatch challenges because it's purely an internal change.
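To show what "apply the boundary conditions once" might look like for the second strategy, here is a one-dimensional sketch. It assumes zero-based periodic indices mapped onto a one-based parent array of length n. `wrapranges` is a hypothetical helper, not an existing FFTViews.jl function, and the dispatch wiring into getindex is deliberately left out.

```julia
# Hypothetical helper: map a zero-based periodic index range onto at most two
# contiguous ranges of one-based parent indices, for a parent of length n.
function wrapranges(r::AbstractUnitRange{<:Integer}, n::Integer)
    length(r) <= n || throw(ArgumentError("range spans more than one period"))
    isempty(r) && return (1:0, 1:0)
    lo = mod(first(r), n) + 1          # the only mod for the whole range
    hi = lo + length(r) - 1
    hi <= n && return (lo:hi, 1:0)     # no wraparound: second piece is empty
    return (lo:n, 1:(hi - n))          # wraps past the end exactly once
end

a = collect(1.0:10.0)
r1, r2 = wrapranges(-2:3, 10)          # periodic indices -2, -1, 0, 1, 2, 3
vals = vcat(a[r1], a[r2])              # two contiguous copies, no per-index mod

# Reference result using one mod per index, for comparison.
ref = [a[mod(i, 10) + 1] for i in -2:3]
println(vals == ref)
```

Ranges longer than one period are rejected here just to keep the sketch short; a real method would have to decide how to handle them.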

The best option, of course, is to do all of them, but that would depend on your level of commitment to this problem.

JuliaArrays / FFTViews.jl

performance #17