FluxML / Flux.jl

Relax! Flux is the ML library that doesn't make you tensor
https://fluxml.ai/

Zero-sized arrays cannot be applied to Dense layers. #2407

Open bicycle1885 opened 3 months ago

bicycle1885 commented 3 months ago

I encountered an issue when applying a zero-sized array to a Dense layer in Flux.jl. The call fails with a DivideError (integer division error), as the layer does not handle zero-sized input arrays correctly.

julia> using Flux

julia> dense = Dense(4 => 5); x = randn(Float32, 4, 0, 6);

julia> dense(x)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ ./int.jl:295 [inlined]
 [2] divrem
   @ ./div.jl:201 [inlined]
 [3] divrem
   @ ./div.jl:179 [inlined]
 [4] _reshape_uncolon
   @ ./reshapedarray.jl:128 [inlined]
 [5] reshape
   @ ./reshapedarray.jl:119 [inlined]
 [6] reshape
   @ ./reshapedarray.jl:118 [inlined]
 [7] (::Dense{typeof(identity), Matrix{Float32}, Vector{Float32}})(x::Array{Float32, 3})
   @ Flux ~/.julia/packages/Flux/Wz6D4/src/layers/basic.jl:179
 [8] top-level scope
   @ REPL[4]:1

mcabbott commented 3 months ago

What output would you like here? There is no data, of course:

julia> x = randn(Float32, 4, 0, 6)
4×0×6 Array{Float32, 3}

If you're trying to propagate size information, then see Flux.outputsize.

bicycle1885 commented 3 months ago

The expected output is an array of size (5, 0, 6) in this example. Of course there is no data, but I sometimes slice an array with x[:,i+1:i+k,:], where k is non-negative and hence may be zero. I don't want to special-case that.
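
Until the layer handles this case, one possible workaround is to do the flattening and un-flattening by hand with explicit sizes. The stack trace above ends in a reshape that has to infer a dimension from a colon; presumably this is the reshape that restores the trailing dimensions, where the remaining sizes multiply to zero and the colon cannot be resolved. The sketch below uses only Base Julia: dense_like, W, b, and apply_first_dim are hypothetical stand-ins for illustration, not Flux's actual implementation or API.

```julia
# Stand-in for a Dense(4 => 5) layer using only Base Julia.
# W and b are hypothetical parameters, not Flux's actual fields.
W = randn(Float32, 5, 4)
b = zeros(Float32, 5)
dense_like(x::AbstractVecOrMat) = W * x .+ b

# Apply `f` along the first dimension of `x`, keeping all trailing dimensions.
# `outdim` is passed explicitly, so no size has to be inferred with `:`
# when the result is reshaped back.
function apply_first_dim(f, x::AbstractArray, outdim::Integer)
    x2 = reshape(x, size(x, 1), :)   # 4×0×6 -> 4×0; this colon is fine (0 ÷ 4 == 0)
    y2 = f(x2)                       # 5×0
    return reshape(y2, outdim, size(x)[2:end]...)  # explicit sizes: 5×0×6
end

x = randn(Float32, 4, 0, 6)
y = apply_first_dim(dense_like, x, 5)
size(y)  # (5, 0, 6)
```

With a real layer, the same helper could be called as apply_first_dim(dense, x, 5), on the assumption that the layer accepts a 4×0 matrix, which the two-dimensional case suggests.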

bicycle1885 commented 3 months ago

Also note that the following works as expected. The behavior should be consistent with this.

julia> dense = Dense(4 => 5); x = randn(Float32, 4, 0);

julia> dense(x)
5×0 Matrix{Float32}

mcabbott commented 3 months ago

Yes, that's not crazy. I suspect this one could be made to work without too much effort. But there are likely to be many other places that similarly assume nonempty input, and I don't relish the thought of tracking them all down. For example:

julia> Conv((2,2), 1=>3)(randn32(10,10,1,0))
ERROR: TaskFailedException

    nested task error: ArgumentError: cannot create partitions of length 0

julia> GlobalMeanPool()(randn32(10,0,1,2))
ERROR: DivideError: integer division error