Open tkf opened 4 years ago
@tkf thanks - with the ordering changes in #13 I'm thinking we can do things like ensure tokens have a total ordering and that you can find the first
and last
token as well as a way of finding the (approximate) midpoint and iterating through sub-intervals. Therefore I imagine implementing this in a generic way should be possible.
I haven't really thought through what order such dependencies should take. So far I've been keeping dependencies out because, well, IMO Julia 2.0 deserves an ordered dictionary in Base
as well as enhancements to the AbstractDict
interface, and I intened this to be the playground for that.
Actually, I bit the bullet and defined halve
for Base.Dict
using some implementation details of iterate(::Dict)
. So, I think a similar trick can be used even for "old" HashDictionary
too. I also added halve
spec for unordered collection: https://juliafolds.github.io/SplittablesBase.jl/dev/#SplittablesBase.halve Of course, it'd be great if #13 helps implementing this.
I also want to move SplittablesBase.jl to Base
too. I think we need something like it for composable data-parallel algorithms. So let's hope both of them get into Base
so that we don't have to think about dependency issue :laughing:
Meanwhile, it's possible to implement halve
outside of Dictionaries.jl. But from my experience from implementing iterate(::Dict)
, it requires to touch some internals of the dictionary. So, I think it's better to do it inside Dictionaries.jl. One mid-ground option may be to implement unexported Dictionaries.halve
in Dictionaries.jl so that I can define SplittablesBase.halve(::AbstractDictionary)
at import-time via @require
.
Yes. I will need approximate token bisection for searchsortedxxx
functions when they land, so no worries.
To be clear, yes we can define halve
for any concrete type, I was proposing to have it implemented for all tokenizable AbstractDictionary
. I'm aiming for all the free stuff you get with AbstractArray
by defining axes
, IndexStyle
, getindex
, isassigned
, setindex!
and similar
.
Oh, I see. Yes, it'd be great if there is a way to define halve
on abstract types. It was a bit daunting to think that "complex" collections like dictionaries might need halve
implementation for each concrete type. Token-based API sounds like a good "middleware" for this.
Hi, I just registered a small package SplittablesBase.jl which is used from Transducers.jl. Essentially, you can support parallel reduce and everything that can be derived from it (e.g., map) by defining a single function
halve(collection)
. I wonder if you are interested in defining it.