Solves #70

About

This PR is part of a series of PRs named "svm training improvement $n". Each one presents an improvement, or a combination of improvements, as an attempt to make training faster and consume less memory.

⚠️⚠️ Do not merge this PR, as we first need to compare it with the other attempts. ⚠️⚠️

Description

Currently, all iterations of the grid search run during SVM training are executed concurrently. This results in multiple SVMs being loaded in memory at the same time, which consumes a lot of RAM. By using Bluebird's mapSeries, this PR makes sure that only one SVM is loaded at any time during the grid search.

Unfortunately, because the node-svm binding uses a Napi::AsyncWorker, trainings that previously ran in parallel on worker threads now run one after another, so this fix also reduces training speed.
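To illustrate the trade-off described above, here is a minimal sketch that contrasts a concurrent grid search with a sequential one. It is not the code from this PR: `trainSvm`, `GridPoint`, and the in-flight counters are hypothetical stand-ins, and `mapSeries` is a small hand-written helper that mimics the sequential behaviour of Bluebird's mapSeries so that the sketch has no dependencies.

```typescript
// Hypothetical grid point; the real hyperparameters depend on the kernel used.
interface GridPoint {
  C: number
  gamma: number
}

// Counters standing in for "number of SVMs loaded in memory at once".
let inFlight = 0
let peakInFlight = 0

function resetCounters(): void {
  inFlight = 0
  peakInFlight = 0
}

// Placeholder for the real training call (an assumption, not the node-svm API).
async function trainSvm(point: GridPoint): Promise<number> {
  inFlight++
  peakInFlight = Math.max(peakInFlight, inFlight)
  // Simulate asynchronous work done off the main thread.
  await new Promise<void>((resolve) => setTimeout(resolve, 5))
  inFlight--
  return point.C * point.gamma // fake score, for illustration only
}

// Sequential map: the next item starts only once the previous one has resolved.
async function mapSeries<T, R>(items: T[], fn: (item: T) => Promise<R>): Promise<R[]> {
  const results: R[] = []
  for (const item of items) {
    results.push(await fn(item))
  }
  return results
}

// All trainings start immediately, so they are all in memory together.
async function gridSearchConcurrent(grid: GridPoint[]): Promise<number[]> {
  return Promise.all(grid.map((point) => trainSvm(point)))
}

// Only one training is in flight at any time.
async function gridSearchSequential(grid: GridPoint[]): Promise<number[]> {
  return mapSeries(grid, trainSvm)
}
```

With a grid of four points, the concurrent version reaches four trainings in flight while the sequential version never exceeds one, at the cost of waiting for each training before starting the next.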

Performance

On clinc150, using a local lang server with dimension 100:

branch	memory used (MB)	time to train (s)
master	~800	101
this	~700	250

On John Doe*, using the remote lang server https://lang-01.botpress.io:

branch	memory used (GB)	time to train (min)
master	~40	20
this	< 2	38

* John Doe is a very large bot that is known internally.


svm training improvement 1 #88
