axa-group / nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
MIT License
6.28k stars 621 forks

Having difficulty extracting entities using the default extractor #860

Closed notquiterussell closed 1 year ago

notquiterussell commented 3 years ago

Describe the bug

Hi, I've recently switched from constructing the NLP instance directly in JavaScript to using the 'dockStart' approach. Since then, the default entity extraction no longer happens.

I'm sure I've simply made a mistake with the config, but for the life of me I cannot see what I've done.

To Reproduce

const { dockStart } = require('@nlpjs/basic');

describe('Tests of entity extraction', () => {
  it('Can extract a number', async () => {
    const dock = await dockStart({
      settings: {
        nlp: {
          languages: ['en'],
          forceNER: true,
          corpora: [
            {
              name: 'Count',
              locale: 'en',
              data: [
                {
                  intent: 'action/count',
                  utterances: ['This is'],
                  answers: ['You said'],
                },
              ],
            },
          ],
        },
        ner: {
          log: false,
        },
      },
      use: ['Basic', 'LangEn'],
    });
    const nlp = dock.get('nlp');
    await nlp.train();

    const result = await nlp.process('This is 12');
    console.log(JSON.stringify(result, null, 2));
    expect(result.intent).toEqual('action/count');
    expect(result.answer).toEqual('You said');
    expect(result.entities).toHaveLength(1);
  });
});

Expected behavior

I expected the test above to extract the number 12 from the utterance.

wparad commented 3 years ago

It automatically works out of the box for most entities if you do something like this: https://github.com/axa-group/nlp.js/blob/master/packages/builtin-compromise/README.md#L1

samdeesh commented 2 years ago

Can we have some more documentation added for entities?

Examples of entities (pardon me if I've missed some):

String: How are you doing today?
Entities: You, today

String: I sent you money last evening around 5PM
Entities: I, You, Money, last evening, 5 PM

String: Your account XX1234 has been credited with $10 at 10 am 20 April 2022 from Netflix refund and the total balance is USD 110
Entities: XX1234, $10, 10 am 20 April 2022, Netflix, USD 110

Apollon77 commented 2 years ago

I added some documentation in my PR https://github.com/axa-group/nlp.js/pull/1171/files#diff-6e3a555ca9a8cfee3e84d492bc5db05702d7d2a6508afc7098a6ce9c60cf46da

But yes, each extractor you want to use needs to be registered manually. A configuration option would be great.
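
To make the "register manually" point concrete, here is a minimal, self-contained sketch of the opt-in pattern. It is not the nlp.js API: `ExtractorRegistry` and `numberExtractor` are hypothetical names invented for illustration, and the entity shape is only an assumption. For the actual registration calls, see the documentation linked in the comments above.

```javascript
// Hypothetical stand-in for an extractor registry; NOT the nlp.js API.
class ExtractorRegistry {
  constructor() {
    this.extractors = new Map();
  }

  // Extractors are opt-in: nothing runs unless it was registered by name.
  register(name, extractor) {
    this.extractors.set(name, extractor);
  }

  // Run every registered extractor and collect the entities it finds.
  extract(utterance) {
    const entities = [];
    for (const [name, extractor] of this.extractors) {
      for (const entity of extractor(utterance)) {
        entities.push({ extractor: name, ...entity });
      }
    }
    return entities;
  }
}

// Illustrative number extractor: matches integers and simple decimals.
function numberExtractor(utterance) {
  const entities = [];
  for (const match of utterance.matchAll(/\d+(?:\.\d+)?/g)) {
    entities.push({
      entity: 'number',
      sourceText: match[0],
      start: match.index,
      end: match.index + match[0].length - 1,
      resolution: { value: Number(match[0]) },
    });
  }
  return entities;
}

const registry = new ExtractorRegistry();
const before = registry.extract('This is 12'); // nothing registered yet
registry.register('number', numberExtractor);
const after = registry.extract('This is 12'); // number extractor now runs
console.log(before.length, after.length); // prints "0 1"
```

With this design, an utterance such as 'This is 12' yields no entities until an extractor has been registered, which matches the behaviour described in the original report.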

aigloss commented 1 year ago

Entity extraction is a pretty expensive mechanism, so it's probably better to let users define explicitly which extractors they want to use. Anyway, I'm closing this issue due to inactivity. Please reopen it if you think the topic is still alive.