Description
Currently, with the default configuration, calling the fit method of SnipsNLUEngine several times produces slightly different engines. In many cases, it is useful to have reproducible trainings. The NLU engine configuration contains two random seed parameters, one at the intent classifier level and one at the slot filler level, which are meant to enable this.
However, there are currently two issues related to this:
a bug in the scikit-learn library was causing non-deterministic behavior (see https://github.com/scikit-learn/scikit-learn/pull/13422). This should be fixed in the next version of scikit-learn, but in the meantime the seeds in Snips NLU do not work properly.
the use of random seeds is not very user-friendly at the moment, as you have to set two distinct parameters in the configuration file. Instead, we could have a single high-level parameter that could either be set in the configuration or passed directly to the fit method.
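
To illustrate the second point, here is a minimal sketch of how a single high-level seed could be propagated to every nested seed parameter. It works on a plain dictionary and does not use the Snips NLU API: the helper name `apply_seed`, the `random_seed` key name, and the layout of `example_config` are all assumptions made for illustration, not the actual configuration schema.

```python
import copy


def apply_seed(config, seed, key="random_seed"):
    """Return a deep copy of `config` in which every existing `key` entry is set to `seed`.

    Hypothetical helper: it only overwrites keys that are already present,
    and it leaves the input configuration untouched.
    """
    seeded = copy.deepcopy(config)

    def _walk(node):
        if isinstance(node, dict):
            for k, v in node.items():
                if k == key:
                    # Overwriting an existing key is safe while iterating.
                    node[k] = seed
                else:
                    _walk(v)
        elif isinstance(node, list):
            for item in node:
                _walk(item)

    _walk(seeded)
    return seeded


# Assumed, simplified configuration layout with two distinct seed parameters.
example_config = {
    "intent_parsers_configs": [
        {
            "intent_classifier_config": {"random_seed": None},
            "slot_filler_config": {"random_seed": None},
        }
    ]
}

seeded_config = apply_seed(example_config, 42)
```

With something along these lines, the user would provide one seed and both the intent classifier and the slot filler would receive it, whether the value comes from the configuration or from an argument to the fit method.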