Closed MasonProtter closed 7 months ago
I'm a bit reluctant in removing the signature of the input functions (AbstractArray
), because that will make error messages harder to understand. But these are the minimal requirements:
julia> using ChunkSplitters
julia> struct MinimalInterface end
julia> Base.firstindex(::MinimalInterface) = 1
julia> Base.lastindex(::MinimalInterface) = 7
julia> Base.length(::MinimalInterface) = 7
julia> x = MinimalInterface()
MinimalInterface()
julia> collect(chunks(x; n=3))
3-element Vector{StepRange{Int64, Int64}}:
1:1:3
4:1:5
6:1:7
julia> collect(chunks(x; n=3, split=:scatter))
3-element Vector{StepRange{Int64, Int64}}:
1:3:7
2:3:5
3:3:6
What do you think? It is perfectly possible to remove the signature and document that the interface requires firstindex
, lastindex
, and length
only. Do you think that's useful?
(actually in terms of implementation I think the error messages would be better if they failed at those functions of the interface). I think I'lll release that right away.
It is done in the 2.2.0 branch: https://github.com/m3g/ChunkSplitters.jl/tree/release-2.2.0
If you don't have any suggestion, I'll release it (it will make a contract relative to the interface, so if you find that something can be even further simplified, let me know).
The only downside of this is that now chunks
works on scalars, which I find a bad thing, but that's just how Julia is.
@carstenbauer if you have something to suggest here, please do.
Oops sorry I missed this @lmiq! I think I'm in favour of this. In particular, this is nice because it lets you take chunks of chunks!
Pinging @carstenbauer again here, maybe he missed it to. If you all are ok, I can release that asap.
One thing we could do if you want better error messages is do something like
is_chunkable(x) = false
is_chunkable(x::AbstractArray) = true
is_chunkable(x::Chunk) = true
is_chunkable(x::Tuple) = true
function chunks(x; n, split=:batch)
if !is_chunkable(x)
error("something informative")
end
....
end
and then people can override is_chunkable
for their types if they are able to support it
Just took a quick look and LGTM. Apart from Mason's suggestion I only have a minor stylistic comment: I would probably rename the function argument / field name array
to data
or whatever because it might not be an array after all.
Thanks! Yesterday I wasn't able to do anything. But all is fine!
I know it'd be a lot of work for ChunkSplitters to systematically try to support non-AbstractArray types, but it'd be good if there was some docs guidance for other people to try and implement methods so that their types can be supported as well.