davidavdav / NamedArrays.jl

Julia type that implements a drop-in replacement of Array with named dimensions
Other
120 stars 20 forks source link

views on NamedArrays are slow #49

Open mkborregaard opened 7 years ago

mkborregaard commented 7 years ago

Applying view on a NamedArray tends to be slow, probably because the dimnames Dicts are recalculated. I had a bit of trouble generating an MWE, but on a big (10.000 x 18.000) sparse Boolean NamedMatrix a I had @benchmark view($a, :, :) benchmark at 7 ms, while @benchmark view($(a.array), :, :) benchmarked at 20 ns, i.e. a speedup of a factor 300.000!

davidavdav commented 7 years ago

It looks like the code in view(::NamedArray) is a bit of a hack to get it going, not understanding much of the workings of view().

Base.view{T,N}(n::NamedArray{T,N}, I::Vararg{Any,N}) = namedgetindex(n, map((d,i)->indices(d, i), n.dicts, I)...; useview=true)

There is a map() going on, and further, namedgetindex() does quite a bit of logic to recompute the names of the sliced arrays.

If view has to be efficient for NamedArrays, this would probably need a lot of rework.

mkborregaard commented 7 years ago

Thanks for the heads up! I think for my purposes I may refactor my package to use normal Arrays at the moment, as performance is important. I'm not even sure it can be made very efficient, as the key to the quick namedgetindex lies in the names Dicts being updated, right? Maybe it's simply a tradeoff of quick views vs quick named getindex.

davidavdav commented 7 years ago

I think the whole idea about a view is that only indices are recomputed in accessing individual elements in the view. So view(NamedArray(::Array)) should probably translate NamedArray(view(::Array), view(dicts)) somehow.

mkborregaard commented 7 years ago

Yes absolutely, but the values in the dicts are indices that become invalidated by the view, right? Or am I misunderstanding the design?

davidavdav commented 7 years ago

It would be possible for a name to stay in the (unchanged) dict, but point to an index outside the view. I would say I would have to study the internals of view() for hooking into it, and generate the appropriate error. The reverse---going from view indices to names would probably be even more complicated, like what is needed for names(view(::NamedArray)). I would hope a proper implementation would not require duplicating most of the indexing code for NamedArrays.