sb-ai-lab / LightAutoML

Fast and customizable framework for automatic ML model creation (AutoML)

https://developers.sber.ru/portal/products/lightautoml

Apache License 2.0

1.09k stars, 48 forks

Input contains NaN error when doing linear_l2 model #75

Closed: RishatZagidullin closed this issue 1 year ago

RishatZagidullin commented 1 year ago

🐛 Bug

On some multiclass tasks, the linear model (linear_l2) fails with the following error:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

The error is raised from site-packages/sklearn/utils/validation.py.
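
The error message above names three distinct conditions: NaN, infinity, or a finite value that does not fit into float32. A minimal diagnostic sketch for telling them apart is shown below. The helper name `describe_non_finite` and the sample array are hypothetical and are not part of LightAutoML or scikit-learn; the sketch only assumes that the offending matrix can be obtained as a NumPy array before it reaches the linear model.

```python
import numpy as np


def describe_non_finite(X, dtype=np.float32):
    """Count entries matching each condition named in the sklearn error message.

    Hypothetical helper for debugging; not a LightAutoML or scikit-learn API.
    """
    X = np.asarray(X, dtype=np.float64)
    # Largest finite value representable in the target dtype (float32 by default).
    limit = float(np.finfo(dtype).max)
    finite = np.isfinite(X)
    return {
        "nan": int(np.isnan(X).sum()),
        "inf": int(np.isinf(X).sum()),
        # Finite in float64, but would overflow to inf when cast to `dtype`.
        "too_large": int((finite & (np.abs(X) > limit)).sum()),
    }


# Made-up sample data containing one instance of each problem.
X = np.array([[1.0, np.nan], [np.inf, 1e39], [2.0, -3.0]])
report = describe_non_finite(X)
print(report)
```

Running such a check on the features (or on out-of-fold predictions, if they are fed into the linear model) just before the failing call would show which of the three conditions actually triggers the exception.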

To Reproduce

Steps to reproduce the behavior:

1. Unarchive issue.zip;
2. Place the extracted folder in the LightAutoML directory and cd into the issue folder;
3. Run python ./lama_cpu.py -p ./data/ -k sf-crime -f 2 -n 4 -s 42 -c ./lama_cpu.yml -t 7200;
4. During the fold 2 calculation, the error should appear;
5. If you instead run python ./lama_cpu.py -p ./data/ -k otto -f 2 -n 4 -s 42 -c ./lama_cpu.yml -t 7200, the program should terminate normally on a different dataset (otto).

Expected behavior

I expect the run on the sf-crime dataset to finish successfully, just as it does on otto.

Additional context

The error disappears if the learning rate is lowered from 0.1 to 0.05. But is that a good solution?
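
The issue does not say which component's learning rate was changed, and the root cause is not established here. As a purely illustrative toy (not LightAutoML's optimizer, and with a made-up `curvature` value chosen so that the two step sizes behave differently), the sketch below shows one general mechanism by which a step size of 0.1 can overflow float32 and end in NaN while 0.05 stays stable. Whether this mechanism is behind the reported error is an assumption, not a finding.

```python
import numpy as np


def run_gd(lr, steps=300, curvature=25.0):
    """Plain gradient descent on f(w) = 0.5 * curvature * w**2, kept in float32.

    Toy example only: each step multiplies w by (1 - lr * curvature), so the
    iterates shrink when |1 - lr * curvature| < 1 and grow otherwise.
    """
    w = np.float32(1.0)
    lr32 = np.float32(lr)
    c32 = np.float32(curvature)
    # Silence the overflow/invalid warnings that the diverging run triggers.
    with np.errstate(all="ignore"):
        for _ in range(steps):
            grad = c32 * w
            w = np.float32(w - lr32 * grad)
    return float(w)


diverged = run_gd(0.1)    # factor is about -1.5 per step: grows until float32 overflows
converged = run_gd(0.05)  # factor is about -0.25 per step: shrinks toward zero
print(diverged, converged)
```

In this toy the smaller step size removes the symptom because it changes the dynamics, not because it masks bad data. For the real issue, the same change could equally be hiding a different problem, so inspecting the values that reach the linear model remains the more reliable check.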

Checklist

[x] bug description
[x] steps to reproduce
[x] expected behavior
[x] code sample / screenshots