Closed benag closed 1 year ago
Hello,
Hebrew has no native support in nlp.js, but it should still work:
const { NlpManager } = require('node-nlp');

(async () => {
  const manager = new NlpManager({ languages: ['he'] });
  manager.addDocument('he', 'האם אני מכוסה', 'coverage.cover');
  manager.addDocument('he', 'למה אני מכוסה', 'coverage.cover');
  await manager.train();
  const response = await manager.process('he', 'מכוסה');
  console.log(response);
})();
This is working.
On the other hand, I built a corpus using Google Translate. You can find the corpus here: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/corpora/corpus-he.json
The code for measuring it is here: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/hebrew/12-benchmark.js
Because Hebrew has no native support in nlp.js, the results do not reach 80% precision or higher. These are the results of the measurement:
So the precision is 71.48%.
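For readers who want to run a similar measurement on their own corpus, here is a minimal sketch of how such a figure could be computed. It assumes "precision" means the share of test utterances whose predicted intent matches the expected one, which may not be exactly what the linked benchmark does. `measurePrecision`, `stubClassify` and the test cases are hypothetical names and data made up for illustration; they are not part of nlp.js.

```javascript
// Hypothetical helper, not part of nlp.js. `classify` is any async function
// that takes an utterance and resolves to the predicted intent name.
async function measurePrecision(testCases, classify) {
  let correct = 0;
  for (const { utterance, intent } of testCases) {
    const predicted = await classify(utterance);
    if (predicted === intent) correct += 1;
  }
  // Percentage of test cases classified correctly (0 for an empty set).
  return testCases.length === 0 ? 0 : (correct / testCases.length) * 100;
}

// Stand-in classifier for illustration only. With a trained NlpManager this
// could instead be something like:
//   async (utterance) => (await manager.process('he', utterance)).intent
const stubClassify = async (utterance) =>
  utterance.includes('מכוסה') ? 'coverage.cover' : 'None';

// Made-up test cases, not taken from corpus-he.json.
const testCases = [
  { utterance: 'האם אני מכוסה', intent: 'coverage.cover' },
  { utterance: 'למה אני מכוסה', intent: 'coverage.cover' },
  { utterance: 'שלום', intent: 'greetings.hello' },
];

measurePrecision(testCases, stubClassify).then((precision) => {
  console.log(`${precision.toFixed(2)}% precision`);
});
```

Swapping the stub for a function that calls the trained manager keeps the counting logic unchanged.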
Thanks, does this code use BERT?
No, this is not using BERT; it tokenizes by word. Using BERT is more complex: you need a vocab file from a Hebrew training, and you have two options: connect to a BERT server written in Python, or use the BERT tokenizer in JavaScript that is included but not documented right now.
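To illustrate what tokenizing by word means for the Hebrew example above, here is a rough sketch. It is a plain Unicode-aware splitter written for this explanation, not the tokenizer nlp.js uses internally, and `tokenizeByWord` is a hypothetical name.

```javascript
// Illustrative only: a simple Unicode-aware word splitter, not the actual
// nlp.js tokenizer. Splits on anything that is not a letter, mark or digit.
function tokenizeByWord(text) {
  return text
    .split(/[^\p{L}\p{M}\p{N}]+/u)
    .filter((token) => token.length > 0);
}

const trained = tokenizeByWord('האם אני מכוסה');
const query = tokenizeByWord('מכוסה');

console.log(trained);
// Checks whether every word of the query also appears in the training utterance.
console.log(query.every((token) => trained.includes(token)));
```

With word-level tokens, a one-word query shares a whole token with both training utterances, which is consistent with the short example matching even without native Hebrew support.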
Closing due to inactivity. Please reopen if you think the topic is still alive.
I can't get the framework to identify intents in Hebrew. Here is my code:
this.manager.addDocument('he', 'האם אני מכוסה', 'coverage.cover');
this.manager.addDocument('he', 'למה אני מכוסה', 'coverage.cover');
await this.manager.train();
this.manager.save();
const response = await this.manager.process('he', text);
Can you kindly show me how this can be done?
Thank you,