modelfoxdotdev / modelfox

ModelFox makes it easy to train, deploy, and monitor machine learning models.

panic after completing training #11

Closed tomcraven closed 2 years ago

tomcraven commented 3 years ago

tangram_cli 0.5.0

invoked with

tangram train --file test.csv --target price_60 --output output.tangram
error: panicked at 'called `Option::unwrap()` on a `None` value', crates/core/train.rs:1699:65
   0: tangram::train::train::{{closure}}
   1: std::panicking::rust_panic_with_hook
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:595:17
   2: std::panicking::begin_panic_handler::{{closure}}
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:495:13
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:141:18
   4: rust_begin_unwind
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:493:5
   5: core::panicking::panic_fmt
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:92:14
   6: core::panicking::panic
             at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:50:5
   7: tangram::train::train::{{closure}}
   8: tangram::main
   9: std::sys_common::backtrace::__rust_begin_short_backtrace
  10: main
  11: __libc_start_main
  12: _start

i've attached the test data that causes this issue. apologies that it's so large (edit: maybe it's not that bad :smile:); it was the smallest reproducible test set i could come up with

let me know if there's anything else i can provide!

test.zip

isabella commented 3 years ago

@tomcraven thank you for providing this example. It uncovered an important bug: Tangram is not properly handling linear models that fail to converge.

At the end of training, we compare the models using the model comparison metric, and we assumed that every model had a finite root mean squared error. That assumption did not hold here because two of the models in the grid failed to converge during training.

I'm working on a fix now and I'll let you know when it is ready.

isabella commented 3 years ago

@tomcraven, @nitsky made a change that filters out the models whose comparison metric value is NaN: https://github.com/tangramxyz/tangram/commit/af0e48cb0f166ea43e5721c8d82a70c0dd43ffd7. We are working on a new release now and I'll let you know when it's available.
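
For readers following along, here is a minimal sketch of the idea of skipping models with a non-finite comparison metric before choosing the best one. It is not the code from the commit above: `GridResult` and `choose_best` are hypothetical names invented for illustration, and the sketch assumes a lower root mean squared error is better. One relevant fact from the Rust standard library is that `partial_cmp` on floats returns `None` when either operand is NaN, so unwrapping that result while comparing models is one plausible way to hit an `Option::unwrap()` panic like the one reported; whether that is what happens at `train.rs:1699` is an assumption.

```rust
/// Hypothetical stand-in for one trained model in the hyperparameter grid.
#[derive(Debug, Clone, PartialEq)]
struct GridResult {
    name: String,
    /// Comparison metric; NaN or infinite if the model failed to converge.
    rmse: f32,
}

/// Returns the model with the lowest finite RMSE, or `None` if no model
/// in the grid produced a finite metric.
fn choose_best(results: &[GridResult]) -> Option<&GridResult> {
    results
        .iter()
        // Drop models whose metric is NaN or infinite.
        .filter(|r| r.rmse.is_finite())
        // `total_cmp` is a total order on floats, so no unwrap is needed.
        .min_by(|a, b| a.rmse.total_cmp(&b.rmse))
}
```

Returning an `Option` leaves the case where every model failed to converge to the caller, which can then report a readable error instead of panicking.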

tomcraven commented 3 years ago

amazing! thank you @isabella & @nitsky

nitsky commented 2 years ago

@tomcraven this should be fixed in v0.6.0. Please re-open this issue if you are still having trouble.