Closed: pgagarinov closed this issue 3 years ago
Thanks for reporting this! I realized that I hadn't incorporated epsilon buffers for stability into some gradient calculations, which I suspect is the root cause of those NaNs. I should be able to validate that later today.
@pgagarinov I've just pushed a PR that I think should partly solve the issue you encountered. Epsilons were added to the gradient calculations, which provides some remedy. However, I did encounter some cases where the model still overflowed; I suspect the reason is the accumulated predictions from all the trees reaching an overflow. A mechanism to cap the total prediction remains to be considered.
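For illustration only (this is a sketch of the general technique, not the actual EvoTrees.jl internals from the PR), clamping predicted probabilities away from 0 and 1 with a small epsilon keeps the logloss gradient and hessian finite:

```julia
# Illustrative sketch — not the actual EvoTrees.jl code.
const ϵ = 1e-8

σ(x) = 1 / (1 + exp(-x))

function logistic_grads(pred, y)
    p = clamp(σ(pred), ϵ, 1 - ϵ)  # epsilon buffer: avoids log(0) and 0 * Inf
    grad = p - y                  # first derivative of logloss w.r.t. pred
    hess = p * (1 - p)            # second derivative, now bounded away from 0
    return grad, hess
end
```

Without the `clamp`, a saturated prediction drives `p` to exactly 0 or 1 in Float32, and the hessian vanishes, which can propagate NaNs through the tree-split gain calculations.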
To limit the risk of encountering such issues, I'd recommend running the algorithm on Float64, which can be set by specifying T=Float64 in the parameters. With v0.7.2, which I just pushed, Float64 is now the default instead of Float32.
Also, for classification with 2 outputs, I noticed that the logistic regression appears to converge and train faster than the softmax / multi-class approach. However, this requires passing y as a numeric / float:
EvoTreeRegressor(T=Float64,
    loss=:logistic, metric=:logloss,
    nrounds=100, nbins=100,
    λ=0.5, γ=0.1, η=0.1,
    max_depth=6, min_weight=5.0,
    rowsample=0.5, colsample=1.0)
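Using those parameters, the full call might look like the following sketch (the variable names and the label encoding are assumptions; `fit_evotree` and `predict` are from the EvoTrees.jl API of that era):

```julia
using EvoTrees

params = EvoTreeRegressor(T=Float64,
    loss=:logistic, metric=:logloss,
    nrounds=100, nbins=100,
    λ=0.5, γ=0.1, η=0.1,
    max_depth=6, min_weight=5.0,
    rowsample=0.5, colsample=1.0)

# :logistic expects numeric targets, so encode the two classes as 0.0 / 1.0
# (y_raw and x_train are hypothetical names for your labels and features)
y_train = Float64.(y_raw .== first(unique(y_raw)))
model   = fit_evotree(params, x_train, y_train)
preds   = predict(model, x_train)  # probabilities in (0, 1)
```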
@jeremiedb Thanks for fixing this, I'll test this and let you know if this helps.
@pgagarinov There have been further improvements to stability, speed and memory consumption in version 0.8.0. I'd assume it resolves the current issue; let me know otherwise.
The last line of the following block throws an exception.
The problem seems to be related to https://github.com/alan-turing-institute/MLJBase.jl/issues/525
Here is the exception.
This behavior creates a real problem when doing a hyper-parameter search as per https://alan-turing-institute.github.io/MLJ.jl/stable/#Lightning-tour-1
The data is attached: train.zip