botpress / nlu

This repo contains every ML/NLU related code written by Botpress in the NodeJS environment. This includes the Botpress Standalone NLU Server.
23 stars 21 forks source link

fix(nlu-engine): launch svm trainings one after the other #86

Closed franklevasseur closed 3 years ago

franklevasseur commented 3 years ago

I'm kind of ashamed of this bug, as I'm 100% responsible for it, but what is done is done.

This PR should be reviewed by commits

chore(node-svm): rm all async functions of node-svm

This commit does pretty much nothing. I simply removed every async functions from the binding.

I first added those async functions to imitate the previous node svm lib we used.

At the time, I didn't really understand how the event loop worked and I thought those functions would be faster... They are not. This really just bloats up the API of the package.

We don't need to rebuild the binaries as I simply removed method, I didn't changed any code.

fix(nlu-engine): launch svm trainings one after the other

This is the fix.

Long story short, In the grid search, I was launching all grid iterations simultaneously with a Promise.all thinking it would be faster. Again, I didn't fully understand how the event loop worked at the time.

This meant, svms where all training / loading at the same time.

Bassically, every svm.train call made outside the ml toolkit should be simultaneous as these calls are evenly distributed on child threads.

On the other hand, every svm.train call made inside the ml toolkit (in child threads) don't have to be simultaneous as

fix(nlu-engine): launch intent trainings in parallel and log each ctx

For reasons stated right above, intent training has to be launched in parallel for speed purposes.