Open kcajf opened 5 years ago
Yes, that sounds great - the other one I’ve thought of is that indexing a sorted array with a sorted array also gives a sorted result (and we also know that ranges are naturally sorted...). So indexing can basically “preserve” `SortIndex` (as well as uniqueness).
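To make the sortedness argument concrete, here is a minimal sketch in plain Julia. `SortedVec` and its `check` keyword are hypothetical names made up for illustration; this is not the AcceleratedArrays.jl API, just one way the idea might look. Note that preserving uniqueness would additionally need the indices to be strictly increasing, which this sketch does not track.

```julia
# Hypothetical wrapper (not part of AcceleratedArrays.jl): a vector known to be sorted.
struct SortedVec{T} <: AbstractVector{T}
    data::Vector{T}
    # `check = false` lets trusted callers skip the O(n) sortedness test.
    function SortedVec(data::Vector{T}; check::Bool = true) where {T}
        check && !issorted(data) && throw(ArgumentError("data is not sorted"))
        return new{T}(data)
    end
end

Base.size(v::SortedVec) = size(v.data)
Base.getindex(v::SortedVec, i::Int) = v.data[i]
Base.IndexStyle(::Type{<:SortedVec}) = IndexLinear()

# Sorted indices into sorted data give sorted output, so the result can be
# wrapped again without re-checking. Unsorted indices fall back to a plain Vector.
function Base.getindex(v::SortedVec, idx::AbstractVector{Int})
    if issorted(idx)
        return SortedVec(v.data[idx]; check = false)
    else
        return v.data[idx]
    end
end

v = SortedVec([1, 3, 5, 7, 9])
w = v[[1, 3, 4]]   # still a SortedVec
r = v[2:2:4]       # ranges with a positive step are sorted too
u = v[[3, 1]]      # unsorted indices: plain Vector
```

Returning two different types from one method is not type-stable, so a real implementation might prefer to throw, or to dispatch on index types that are sorted by construction.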
When I think of hashes - I think it might be simplest to just drop them (except for maybe uniqueness?). I’m trying to think of compelling end-use cases.
(To say it differently - indexing is definitely in the “roadmap” as it is in my head but it’s been a matter of finding time for me to contribute to this project - and help is always appreciated!)
> When I think of hashes - I think it might be simplest to just drop them (except for maybe uniqueness?). I’m trying to think of compelling end-use cases.
What do you mean by this?
I just mean - it seems quite complicated when indexing to preserve the hash table for some (possibly repeated?) subset. I also think it could be slow - if you want to shrink the size of the hash table you need to recompute hashes etc. (slow for large selections), and for small selections the overhead of the hash table is large while not providing any acceleration benefits. It seems it might be best to leave the decision in the hands of the user, who can always `accelerate` the result when it makes sense for their use case.
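As a rough illustration of the cost being described, the sketch below builds a value-to-positions table in plain Julia and then rebuilds it for a selection. `build_hash_index` is a hypothetical helper, not a function from the package; the point is only that the positions change after indexing, so every selected element gets hashed again.

```julia
# Hypothetical helper: map each distinct value to the positions where it occurs.
function build_hash_index(v::AbstractVector{T}) where {T}
    index = Dict{T,Vector{Int}}()
    for (i, x) in pairs(v)
        push!(get!(() -> Int[], index, x), i)
    end
    return index
end

data = ["a", "b", "a", "c"]
index = build_hash_index(data)

# Positions in the selection differ from positions in `data`, so the old
# table cannot be reused as is; rebuilding hashes each selected element again.
selection = data[[3, 4]]
subindex = build_hash_index(selection)
```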
Would it be reasonable to encode propagation through `getindex` in its type? Something like `HashIndex{Propagating}`?
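One way this suggestion might look, sketched in plain Julia: `HashedVec`, `PropagationMode`, `NonPropagating` and `subset` are all hypothetical names (only `Propagating` comes from the comment above), and `subset` stands in for what would presumably be a `getindex` method in a real implementation.

```julia
# Hypothetical marker types describing what indexing should return.
abstract type PropagationMode end
struct Propagating <: PropagationMode end
struct NonPropagating <: PropagationMode end

# Hypothetical stand-in for a hash-accelerated vector; the first type
# parameter records whether the acceleration survives indexing.
struct HashedVec{P<:PropagationMode,T}
    data::Vector{T}
    index::Dict{T,Vector{Int}}
end

function HashedVec{P}(data::Vector{T}) where {P<:PropagationMode,T}
    index = Dict{T,Vector{Int}}()
    for (i, x) in enumerate(data)
        push!(get!(() -> Int[], index, x), i)
    end
    return HashedVec{P,T}(data, index)
end

# Dispatch on the marker: rebuild the table, or hand back a plain Vector.
subset(v::HashedVec{Propagating}, idx::AbstractVector{Int}) =
    HashedVec{Propagating}(v.data[idx])
subset(v::HashedVec{NonPropagating}, idx::AbstractVector{Int}) = v.data[idx]

p = HashedVec{Propagating}(["x", "y", "x"])
n = HashedVec{NonPropagating}(["x", "y", "x"])
```

Because the choice lives in the type, downstream code (for example a container holding accelerated dimensions) could rely on it without overriding methods it does not own.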
My use-case is using an accelerated array for the dimensions of a `DimensionalArray` (from DimensionalData). I'd like to be able to subset a `DimensionalArray`, and have the result still have accelerated indices. I currently don't see a way to enforce this without some type-piracy.
Hi @andyferris - this project looks great, and is something I am considering extending / building on. I already have some hacky implementations of something similar, but they are not as nicely integrated or as general.
What are your thoughts on adding a family of `getindex(A::AcceleratedVector, idx::AbstractVector{Int})` methods? I.e. indexing into `AcceleratedVector`s with integer `StepRange`s, integer `Vector`s, etc., and returning `AcceleratedVector`s of the same type.

For `SortIndex`, this would be quite easy. When indexing with `StepRange`s, we would have to check the step is positive. For indexing with arrays, we would have to check the array is sorted first. Since we would be constructing `SortIndex`es from known-sorted arrays, it would be good to have a `SortIndex` constructor that doesn't do any checks. This might also be useful generally.

For `HashIndex`, the hash table would have to be modified, but there are likely lots of optimisations / shortcuts to minimise the work. For example, when indexing with `UnitRange{Int}`s, we could add to a global integer offset that is subtracted from the values in the `Dict` when they are accessed. Similar to the above, we might want a more direct `HashIndex` constructor that accepts a pre-built dictionary & offset, etc.

I'd be happy to do some initial work on this after hearing your thoughts.
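The offset idea for `UnitRange{Int}` indexing could be sketched roughly as follows. `OffsetHash`, `slice` and `findpositions` are hypothetical names, not the package's API. The sketch also assumes that positions falling outside the slice must be filtered on lookup, which the proposal above does not spell out.

```julia
# Hypothetical type: data plus a table of positions in the ORIGINAL vector,
# and an offset that is subtracted from stored positions on lookup.
struct OffsetHash{T}
    data::Vector{T}
    positions::Dict{T,Vector{Int}}
    offset::Int
end

# Build the table once for the full vector (offset 0).
function OffsetHash(data::Vector{T}) where {T}
    positions = Dict{T,Vector{Int}}()
    for (i, x) in enumerate(data)
        push!(get!(() -> Int[], positions, x), i)
    end
    return OffsetHash{T}(data, positions, 0)
end

# Contiguous slice: copy the data, share the table, bump the offset.
# No hashes are recomputed. Uses the "direct" (data, table, offset) constructor.
function slice(h::OffsetHash, r::UnitRange{Int})
    return OffsetHash(h.data[r], h.positions, h.offset + first(r) - 1)
end

# Positions of `x` within the sliced data: translate by the offset and
# drop entries that fall outside the slice.
function findpositions(h::OffsetHash, x)
    stored = get(h.positions, x, Int[])
    return [p - h.offset for p in stored if 1 <= p - h.offset <= length(h.data)]
end

h = OffsetHash(["a", "b", "a", "c", "a"])
s = slice(h, 2:4)   # data is ["b", "a", "c"], offset is 1
t = slice(s, 2:3)   # data is ["a", "c"], offset is 2
```

The trade-off in this sketch is that the shared table is never shrunk, so small slices of a large vector keep the full table alive and pay for filtering on every lookup.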