This PR is part of a sequence of PRs named `svm training improvement $n`, each presenting an improvement or a combination of improvements that attempts to make training faster and consume less memory.
⚠️⚠️ Do not merge this PR, as we first need to compare it with the other attempts. ⚠️⚠️
Description
Currently, all iterations of the grid search run during SVM training are executed concurrently. This results in multiple SVMs being loaded in memory at the same time, which consumes a lot of RAM. By using Bluebird's `mapSeries`, this PR makes sure that only one SVM is loaded at any time during the grid search.
Unfortunately, because the `node-svm` binding uses a `Napi::AsyncWorker`, this fix also reduces training speed: the concurrent iterations were presumably running in parallel on worker threads, and running them one at a time gives up that parallelism.
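To illustrate the trade-off described above, here is a minimal sketch (not the code from this PR) that compares a concurrent map with a sequential one. The names `makeTrainer`, `mapConcurrent` and `GridPoint` are hypothetical. The local `mapSeries` is a plain-loop stand-in for Bluebird's `Promise.mapSeries`, and the fake trainer only counts how many "models" are alive at once instead of calling `node-svm`.

```typescript
// One point of the hyperparameter grid (hypothetical shape).
interface GridPoint {
  C: number;
  gamma: number;
}

// Builds a fake trainer that tracks how many trainings are in flight,
// as a proxy for how many SVMs are loaded in memory at the same time.
function makeTrainer() {
  let loaded = 0;
  let peak = 0;
  const train = async (point: GridPoint): Promise<number> => {
    loaded++;
    peak = Math.max(peak, loaded);
    // Stand-in for the asynchronous native training call.
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    loaded--;
    return point.C * point.gamma; // dummy "score"
  };
  return { train, peak: () => peak };
}

// Behaviour before this PR: every iteration starts immediately.
async function mapConcurrent<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  return Promise.all(items.map((item) => fn(item)));
}

// Behaviour with this PR: each iteration waits for the previous one.
async function mapSeries<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    results.push(await fn(item));
  }
  return results;
}
```

With a grid of N points, the concurrent version has N trainings in flight at its peak, while the sequential version never has more than one. That is where the memory saving comes from, and also why the wall-clock time increases.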
Performance
On clinc150 using local lang server with dimension 100:
| branch | memory used (MB) | time to train (s) |
| ------ | ---------------- | ----------------- |
| master | ~800             | 101               |
| this   | ~700             | 250               |
On John Doe* using remote lang server https://lang-01.botpress.io
Solves #70