axa-group / nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
MIT License

NluManager is not a constructor #395

Closed SpeedyCraftah closed 1 year ago

SpeedyCraftah commented 4 years ago

Describe the bug
NluManager is documented to work, but does not exist in the actual package itself.

To Reproduce
Steps to reproduce the behavior:

  1. Attempt to initialise the NluManager class:

    const { NluManager } = require("node-nlp");
    new NluManager(); // => NluManager is not a constructor

Expected behavior
Well, it should be a constructor.


Onyoursix commented 4 years ago

I get the same issue. I also get it when I try to use BayesNLU running the test code. I downgraded to 3.10.2 and it seems to work fine in that version.

OS: Ubuntu 19.10
Node.js: 10.19.0

SpeedyCraftah commented 4 years ago

I think the developer just made docs for features that aren't yet implemented but are possibly planned. Also, the lib is very heavy and resource-intensive: it uses a lot of RAM and completely hogs the CPU when processing a response. I'm moving on to TensorFlow.js or some online-based NLU.

jesus-seijas-sp commented 4 years ago

Hello @SpeedyCraftah, it's just the opposite. You are trying to use features as in version 3, which was monolithic, and not as in version 4, which is split into small packages. You have a guide for the backend here: https://github.com/axa-group/nlp.js/blob/master/docs/v4/quickstart.md and a guide for the frontend here: https://github.com/axa-group/nlp.js/blob/master/docs/v4/webandreact.md
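
For anyone landing here, a minimal sketch of the version 4 style, based on the quickstart guide linked above (the intent, utterances and answer are illustrative):

    // v4 splits the monolith into small packages; @nlpjs/basic provides
    // a "dock" that wires the common ones together.
    const { dockStart } = require('@nlpjs/basic');

    (async () => {
      const dock = await dockStart({ use: ['Basic'] });
      const nlp = dock.get('nlp');
      nlp.addLanguage('en');
      nlp.addDocument('en', 'goodbye for now', 'greetings.bye');
      nlp.addAnswer('en', 'greetings.bye', 'Till next time!');
      await nlp.train();
      const response = await nlp.process('en', 'goodbye');
      console.log(response.intent, response.answer);
    })();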

About the heaviness and resource intensiveness, perhaps you can explain your problems in more detail. The library compiled for the frontend has a size of 111 KB, so I don't think that is heavy. About resource intensiveness, right now this is the only NLP library on the market, as far as I know, able to work with more than 2000 intents; you have an example in the version 3 branch training 10K intents: https://github.com/axa-group/nlp.js/tree/v3.x/examples/squad But that takes approximately 15 minutes on my development computer... Common-size bots train in less than 1 second and give an answer (process) in less than 30 ms.
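
If you want to verify those numbers against your own corpus, a rough timing sketch (NlpManager as exposed by node-nlp; the training data here is illustrative):

    // Measure training and processing time with console.time.
    const { NlpManager } = require('node-nlp');

    (async () => {
      const manager = new NlpManager({ languages: ['en'] });
      manager.addDocument('en', 'goodbye for now', 'greetings.bye');
      manager.addDocument('en', 'bye bye take care', 'greetings.bye');
      manager.addDocument('en', 'hello there', 'greetings.hello');

      console.time('train');
      await manager.train();
      console.timeEnd('train'); // small corpora should train well under 1 second

      console.time('process');
      await manager.process('en', 'I should go now, bye!');
      console.timeEnd('process'); // the ~30 ms figure mentioned above
    })();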

The bots that we have in production don't use more than 400 MB of RAM (we are talking about a Docker container, so that number includes the operating system, Node, ...), even for a typical bot of more than 500 intents with an average of 10 utterances per intent, training in less than 1 second.

One potential cause of performance problems (and also memory usage) is the use of Microsoft Recognizers Text as the builtin entity system. It also seems to have a bug in French when used in a Windows environment, because of the size of its regular expressions. This can be solved in two ways:

  1. If you don't need builtin entities, set ner builtins to an empty array in the settings:

    const manager = new NlpManager({
      languages: [locale],
      nlu: { useNoneFeature: true },
      ner: { builtins: [] }, // disable builtin entity extraction entirely
    });

  2. If you need builtin entities, you can deploy a Duckling instance, which is also dockerized; see the sketch after this list.
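
For illustration, a minimal sketch of pointing NlpManager at a Duckling server; the ner option names (useDuckling, ducklingUrl) follow the library's Duckling docs, but treat them as assumptions and verify them against your version:

    // Sketch: delegate builtin entity extraction to an external Duckling
    // server instead of Microsoft Recognizers Text.
    const { NlpManager } = require('node-nlp');

    const manager = new NlpManager({
      languages: ['en'],
      ner: {
        useDuckling: true,                    // assumed flag name
        ducklingUrl: 'http://localhost:8000', // assumed option; Duckling's default port
      },
    });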

If you can explain your resource-intensiveness problem in more detail, with an example of your code, I can investigate.

Onyoursix commented 4 years ago

@jesus-seijas-sp thanks for those links. I'm very new to all this, but I'll read through the backend and frontend docs.

SpeedyCraftah commented 4 years ago

My problem was integrating this library into something that doesn't only do NLP; while processing, the library would literally bottleneck everything, and nothing else would run until it finished.

jesus-seijas-sp commented 4 years ago

Well... it's up to you to move to any other technology. From my side, I can share the results of the number of processes per second testing the same corpus in English (50 intents, 5 utterances per intent for training; you can find the corpus here: https://github.com/axa-group/nlp.js/blob/v3.x/examples/benchmark/corpus50.json )

Also, take into account the round-trip latency with other providers: in NLP.js it's about 30 ms to resolve, but when you're calling a remote endpoint it's usually more than 500 ms per request with a typical NLP service.

The part that could be blocking is the training, if you have a big corpus. But for this you can use a child process; you have an example of child-process usage for training in the app: https://github.com/axa-group/nlp.js-app/blob/master/server/trainers/nlpjs-trainer.js#L183
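
As a sketch of that child-process approach (the worker file name and message shape here are illustrative, not taken from the linked app):

    // trainer-worker.js: train in a separate process so the main
    // event loop is never blocked by a long training run.
    const { NlpManager } = require('node-nlp');

    process.on('message', async ({ locale, documents }) => {
      const manager = new NlpManager({ languages: [locale] });
      for (const { utterance, intent } of documents) {
        manager.addDocument(locale, utterance, intent);
      }
      await manager.train();
      process.send(manager.export()); // serialized model, as a string
      process.exit(0);
    });

And on the main-process side:

    const { fork } = require('child_process');
    const { NlpManager } = require('node-nlp');

    const worker = fork('./trainer-worker.js');
    worker.send({
      locale: 'en',
      documents: [{ utterance: 'goodbye for now', intent: 'greetings.bye' }],
    });
    worker.on('message', (model) => {
      const manager = new NlpManager({ languages: ['en'] });
      manager.import(model); // load the already-trained model
      // manager.process(...) now answers quickly, with no training stall
    });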

So, I invite you to check whatever technology you consider best. Sincerely, when I asked you to show your code so I could take a look and help, I got no answer from your side, so it's impossible for me to help: you're describing a problem that I'm not able to reproduce.

taf2 commented 4 years ago

It would be good to update the docs in the main README.md... it points here: https://github.com/axa-group/nlp.js/blob/master/docs/v3/ner-manager.md

The main table of contents, right after listing all the advantages of version 4, proceeds to link to the version 3 docs... fixing that might resolve a bunch of confusion...

Apollon77 commented 2 years ago

Docs will be updated in #1171

aigloss commented 1 year ago

Closing due to inactivity. Please, re-open if you think the topic is still alive.