winkjs / wink-eng-lite-web-model

English lite language model for Web Browsers
MIT License
11 stars 8 forks source link

Error when creating more than one instance of nlp object #2

Closed pimpale closed 3 years ago

pimpale commented 3 years ago

Creating a second instance of the nlp object fails with the following error:
Uncaught TypeError: clusters[k].split is not a function

Example of code causing the problem:

let nlp1 = winkNlp(model);
nlp1.readDoc('test 1').printTokens();

let nlp2 = winkNlp(model); // fails here
nlp2.readDoc('test 2').printTokens();

Here is a demonstration of this behavior: https://codepen.io/pimpale-the-sans/pen/RwVyeGX

My use case for having multiple nlp objects is to to learn different custom entities on each one.

The source of the problem is wink-eng-lite-web-model/dist/read-core-model.js:

const model = require("./languages/cur/models/eng-core-web-model.json");
var readModel = function() {
  var
    lexiconAsBuffer,
    xpansionsAsBuffer,
    packing = model.packing,
    featuresData = model.features, // featuresData is modified later on
    pos = model.pos;
    // following lines omitted
    return model
};

This file modifies a global variable, which means that subsequent loads of the model will fail, since the data is no longer in the correct form.

Here is a working solution that prevents this:

var readModel = function() {
  var model = JSON.parse(JSON.stringify(require("./languages/cur/models/eng-core-web-model.json")));
  var
    lexiconAsBuffer,
    xpansionsAsBuffer,
    packing = model.packing,
    featuresData = model.features,
    pos = model.pos;
    // following lines omitted
    return model
};

This code snippet copies the model before changing it. I tested this version, and it seems to solve the error. It also did not noticeably slow down loading the model.

sanjayaksaxena commented 3 years ago

Thanks for highlighting the issue. We had not imagined this use case!

You are right the load time difference is not noticeable — we did a quick measurement and it takes about 30ms to 40ms more on a 2.2 GHz 6-Core Intel Core i7/16GB machine; on mobile device it may go up to 100-200ms more, which should also be acceptable.

Will fix the issue shortly and release a new patch.

sanjayaksaxena commented 3 years ago

Released version 1.2.2 with the above patch.

pimpale commented 3 years ago

Thanks!