JuliaML / MLUtils.jl

Utilities and abstractions for Machine Learning tasks
MIT License

add a type-stable `unstack` method with `Val` dims arg #150

Closed · gabrevaya closed 1 year ago

gabrevaya commented 1 year ago

Addressing https://github.com/JuliaML/MLUtils.jl/issues/149.

codecov-commenter commented 1 year ago

Codecov Report

Merging #150 (ef9523d) into main (c957026) will increase coverage by 0.05%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main     #150      +/-   ##
==========================================
+ Coverage   88.40%   88.46%   +0.05%     
==========================================
  Files          13       15       +2     
  Lines         595      598       +3     
==========================================
+ Hits          526      529       +3     
  Misses         69       69              
Impacted Files Coverage Δ
src/utils.jl 90.14% <100.00%> (+0.06%) :arrow_up:

... and 2 files with indirect coverage changes


ToucheSir commented 1 year ago

Instead of forcing dims to be a positional arg, you can do what Base does with typed_cat: keep unstack(...; dims) as the public API and have it call an internal function that takes dims as a positional arg. Then it's possible to add dispatches on that internal function for Val and more.
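
For illustration, a minimal sketch of the dispatch structure being described, using hypothetical unstack_sketch names so it doesn't clash with the real unstack (it only shows the pattern, not the eventual implementation):

# public API keeps the keyword argument and forwards to an internal function
unstack_sketch(xs; dims) = _unstack_sketch(dims, xs)

# the internal function takes dims positionally, so it can gain methods by dispatch
_unstack_sketch(dims::Integer, xs) =
    [copy(selectdim(xs, dims, i)) for i in axes(xs, dims)]

# extra dispatch for Val, unwrapping to the integer path
_unstack_sketch(::Val{dims}, xs) where {dims} = _unstack_sketch(dims, xs)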

gabrevaya commented 1 year ago

I was trying to copy the idea from cat_t in Base, as you suggested (I didn't find any function called typed_cat, so I guess you meant this one). I managed to dispatch to a Val method, but now the type is not being inferred correctly. I still don't fully understand the logic of this (it's the first time I'm using Val). This is what I have so far:

@inline unstack(xs...; dims) = _unstack(dims, xs...)
_unstack(dims, xs...) = _unstack_t(dims, Base.promote_eltypeof(xs...), xs...)
dims2unstack(::Val{dims}) where dims = dims2unstack(dims)
dims2unstack(dims::Integer) = dims

@inline function _unstack_t(dims, ::Type{T}, xs...) where {T}
    unstackdims = dims2unstack(dims)
    [copy(selectdim(xs..., unstackdims, i)) for i in 1:size(xs..., unstackdims)]
end
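
One rough way to see the inference problem (not part of the PR; it assumes the definitions above have been evaluated):

using Test, InteractiveUtils

x = rand(3, 4, 5)
# @code_warntype highlights where the element type of the comprehension widens:
@code_warntype unstack(x; dims=Val(2))
# Test.@inferred is expected to throw while the return type is not concretely inferred:
# @inferred unstack(x; dims=Val(2))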

I'll continue trying to make this work tomorrow.

ToucheSir commented 1 year ago

Given unstack only takes a single xs, you can probably eliminate most of the varargs and splats. Something like this:

@inline unstack(xs; dims) = _unstack(dims, xs)

@inline dims2unstack(::Val{dims}) where dims = dims2unstack(dims)
@inline dims2unstack(dims::Integer) = dims

@inline _unstack(dims, xs) = [copy(selectdim(xs, dims2unstack(dims), i)) for i in axes(xs, dims2unstack(dims))]

I don't think @inline is required on all the functions, so maybe try selectively removing it. Calling dims2unstack inside the array comprehension is very important though: the comprehension lowers to a Generator call, which shoves the body into a closure, and a precomputed unstackdims captured by that closure loses constant propagation.
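
A quick usage sketch of the version above (assuming those definitions; the shapes and element types here are just illustrative):

using Test

x = rand(3, 4, 5)
ys = @inferred unstack(x; dims=Val(2))   # with a Val dims, this should now infer as a concrete Vector of 3x5 slices
@test length(ys) == 4
@test size(first(ys)) == (3, 5)
zs = unstack(x; dims=2)                  # a plain Integer dims still works, just without the inference guarantee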

gabrevaya commented 1 year ago

The only check that failed was on Julia nightly, and it doesn't seem to be related to the changes in this PR. Should I do anything else?

gabrevaya commented 1 year ago

Thanks! I committed your suggestion to remove that @inline, and I see that you approved the PR, but it is still not merged. Do I have to do something else, or is another reviewer needed?

ToucheSir commented 1 year ago

Wait for one of the maintainers to take a break from, or finish, their work? If you don't hear anything after 24-48 hours, that's a better time to bump.

gabrevaya commented 1 year ago

OK, I apologize; I didn't mean to rush or bother you. I just wasn't sure if there was anything else I needed to do on my end. I thank you very much for your assistance, and I'm truly sorry for any inconvenience.

ToucheSir commented 1 year ago

No need to apologize. My point is that open source dev means that people are all over the world with all sorts of time commitments. Longer lead times and very async communication should be the default expectation.