FluxML / MLJFlux.jl

Wrapping deep learning models from the package Flux.jl for use in the MLJ.jl toolbox
http://fluxml.ai/MLJFlux.jl/
MIT License

MLJFlux super slow / stackoverflow #185

Closed Moelf closed 3 years ago

Moelf commented 3 years ago
using MLJ
using DataFrames  # needed for DataFrame; MLJ does not export it

N = 10^4
y = rand([0,1], N)
x1 = Float32.(y .+ 1)
df = DataFrame(y = y, 
    x1 = x1, 
    x2 = x1,
    x3 = x1,
    x4 = x1);

y, X = unpack(df, ==(:y), !=(:y); :y=>Multiclass{2});

CLS = @load NeuralNetworkClassifier
nnc = CLS(epochs = 1)

mach = machine(nnc, X, y)

@time fit!(mach)
 48.405561 seconds (101.52 M allocations: 6.914 GiB, 2.21% gc time, 0.46% compilation time)

It also hits a stack overflow if you set N = 1850786.

ablaom commented 3 years ago

@Moelf Thanks indeed for reporting. This behaviour is terrible.

I have identified the culprit: the way one-hot encoding is being done. Working on a fix now.
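
For readers wondering what a cheap one-hot encoding looks like, here is a minimal sketch in plain Julia. It is not MLJFlux's actual code or the fix referred to above; `onehot_matrix` is a hypothetical helper written only for illustration. It preallocates the output once and fills it in a single loop, so the cost should grow linearly with the number of observations and no recursion or argument splatting is involved.

```julia
# Hypothetical helper for illustration only; not part of MLJFlux or Flux.
# Returns a (number of classes) x (number of observations) Boolean matrix
# with exactly one `true` per column.
function onehot_matrix(labels::AbstractVector, classes::AbstractVector)
    # Build the class => row lookup once, so each label costs one Dict lookup.
    row = Dict(c => i for (i, c) in enumerate(classes))
    # Preallocate the whole output instead of concatenating per-label vectors.
    out = falses(length(classes), length(labels))
    for (j, label) in enumerate(labels)
        out[row[label], j] = true
    end
    return out
end

# Same label layout as in the report above: binary labels, N observations.
N = 10^4
y = rand([0, 1], N)
Y = onehot_matrix(y, [0, 1])

@assert size(Y) == (2, N)
@assert all(sum(Y, dims=1) .== 1)  # one active class per observation
```

Timing this with `@time onehot_matrix(y, [0, 1])` for a few values of N is a quick way to check that the encoding step on its own scales as expected.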