axa-group / nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
MIT License
6.22k stars 616 forks

Not able to extract entities #1320

Open Jayesh-Patidar opened 1 year ago

Jayesh-Patidar commented 1 year ago

**Describe the bug**

I want to build a chatbot with this package, but it does not recognise entities properly. I have tried many solutions, but none of them work. Thanks in advance to anyone who can help.

**To Reproduce**

Steps to reproduce the behavior:

  1. Install node.js. I am using v20.0.0
  2. Install the latest version of node-nlp. I am using v4.27.0
  3. Now you can run the following code
    
    const { NlpManager } = require("node-nlp");

    const manager = new NlpManager({ languages: ["en"], forceNER: true });

    manager.addNamedEntityText('hero', 'spiderman', 'en');

    manager.addDocument("en", "My name is %name%", "saveName");
    manager.addDocument("en", "I will be traveling on %date%", "saveDate");
    manager.addDocument("en", "I saw %hero%", "sawHero");

    manager.addAnswer("en", "saveName", "Nice talking to you {{name}}");
    manager.addAnswer("en", "saveDate", "Have safe journey on {{date}}");
    manager.addAnswer("en", "sawHero", "Really you saw {{hero}}!");

    (async () => {
      await manager.train();
      manager
        .process("en", "My name is John")
        .then((response) => console.log(response.answer))
        .then(() => manager.process("en", "I will be traveling on 04 June"))
        .then((response) => console.log(response.answer))
        .then(() => manager.process("en", "I saw spiderman"))
        .then((response) => console.log(response.answer));
    })();

  4. Below are the outputs of the above code

    Nice talking to you {{name}}
    Have safe journey on 04 June
    Really you saw spiderman!


**Expected behavior**

    Nice talking to you John
    Have safe journey on 04 June
    Really you saw spiderman!

  5. If I remove the line `manager.addNamedEntityText('hero', 'spiderman', 'en')`, the output is

    Nice talking to you John
    Have safe journey on 04 June
    Really you saw {{hero}}!


**Expected behavior**

    Nice talking to you John
    Have safe journey on 04 June
    Really you saw spiderman!



So the main problem is that when I don't register a named entity, the entity is not recognised at all. I am not sure how the library is able to identify the `date`. I want the same behaviour as for `date` for all the other variables. **I do not know the entity values in the text in advance, so I cannot add them as named entities**.

If anyone has another approach to solve this, it would be a great help.

MarketingPip commented 1 year ago

Here you go @Jayesh-Patidar! This should work! Try this out.

  nlp.slotManager.addSlot('travel', 'fromCity', true, { en: 'From where you are traveling? {{fromCity}}' });
  nlp.addDocument('en', 'I want to travel from @fromCity to @toCity', 'travel');
  nlp.addNerRuleOptionTexts('en', 'fromCity', ["Israel", "USA"]);
  nlp.addNerRuleOptionTexts('en', 'toCity', ["Israel", "USA"]);

  nlp.addAnswer('en', 'greetings.hello', 'Greetings!');
  await nlp.train();
  const response = await nlp.process('en', 'I want to travel from Israel to USA');
  console.log(response.entities[1].utteranceText);

Hope this helps 👍

PS: this comes down to the documentation for this project needing better organization. The method for extracting slots has changed across versions of NLP.js, but this should work with the version listed in your issue.

Jayesh-Patidar commented 1 year ago

Thanks for your reply @MarketingPip, but in this case too you have defined the list of values for fromCity and toCity. Now the NLP will look only for these two cities, but what if I pass anything else? Will that work?

Apollon77 commented 1 year ago

@Jayesh-Patidar can you post the full response object of the non working call?

MarketingPip commented 1 year ago

> Thanks for your reply @MarketingPip, but in this case too you have defined the list of values for fromCity and toCity. Now the NLP will look only for these two cities, but what if I pass anything else? Will that work?

You're more than welcome! My apologies, I should have added another example to clarify cases where an entity's values are not defined in advance.

I am on mobile right now and can't confirm, but I think I recall the documentation describing a symbol or syntax for writing an "empty slot" in your corpus to be filled.

You'll have to dig deeper, or when I'm at my computer later I'll dig into my treasure chest of random code snippets and see if I have anything else similar to the one I provided above. 👆

Jayesh-Patidar commented 1 year ago

@Apollon77 Below are the responses for my code

Input: My name is John

{ locale: 'en',
  utterance: 'My name is John',
  settings: undefined,
  languageGuessed: false,
  localeIso2: 'en',
  language: 'English',
  nluAnswer: { classifications: [
            { intent: 'saveName', score: 1
            },
            { intent: 'sawHero', score: 0
            },
            { intent: 'saveDate', score: 0
            }
        ],
     entities: undefined,
     explanation: undefined
    },
  classifications: [
        { intent: 'saveName', score: 1
        },
        { intent: 'sawHero', score: 0
        },
        { intent: 'saveDate', score: 0
        }
    ],
  intent: 'saveName',
  score: 1,
  domain: 'default',
  sourceEntities: [],
  entities: [],
  answers: [
        { answer: 'Nice talking to you {{name}}', opts: undefined
        }
    ],
  answer: 'Nice talking to you {{name}}',
  actions: [],
  sentiment: { score: 0.375,
     numWords: 4,
     numHits: 1,
     average: 0.09375,
     type: 'senticon',
     locale: 'en',
     vote: 'positive'
    }
}

Input: I will be traveling on 04 June

{ locale: 'en',
  utterance: 'I will be traveling on 04 June',
  settings: undefined,
  languageGuessed: false,
  localeIso2: 'en',
  language: 'English',
  nluAnswer: { classifications: [
            { intent: 'saveDate', score: 0.9927472726951043
            },
            { intent: 'sawHero', score: 0.007252727304895593
            }
        ],
     entities: undefined,
     explanation: undefined
    },
  classifications: [
        { intent: 'saveDate', score: 0.9927472726951043
        },
        { intent: 'sawHero', score: 0.007252727304895593
        }
    ],
  intent: 'saveDate',
  score: 0.9927472726951043,
  domain: 'default',
  sourceEntities: [
        { start: 23,
       end: 24,
       resolution: { value: '4'
            },
       text: '04',
       typeName: 'number',
       entity: 'number'
        },
        { start: 23,
       end: 29,
       resolution: { values: [
                    { timex: 'XXXX-06-04', type: 'date', value: '2023-06-04'
                    },
                    { timex: 'XXXX-06-04', type: 'date', value: '2024-06-04'
                    }
                ]
            },
       text: '04 june',
       typeName: 'datetimeV2.date',
       entity: 'date'
        }
    ],
  entities: [
        { start: 23,
       end: 24,
       len: 2,
       accuracy: 0.95,
       sourceText: '04',
       utteranceText: '04',
       entity: 'number',
       rawEntity: 'number',
       resolution: { strValue: '4', value: 4, subtype: 'integer'
            }
        },
        { start: 23,
       end: 29,
       len: 7,
       accuracy: 0.95,
       sourceText: '04 June',
       utteranceText: '04 June',
       entity: 'date',
       rawEntity: 'datetimeV2.date',
       resolution: { type: 'interval',
          timex: 'XXXX-06-04',
          strPastValue: '2023-06-04',
          pastDate: Sun Jun 04 2023 05:30:00 GMT+0530 (India Standard Time),
          strFutureValue: '2024-06-04',
          futureDate: Tue Jun 04 2024 05:30:00 GMT+0530 (India Standard Time)
            }
        }
    ],
  answers: [
        { answer: 'Have safe journey on 04 June', opts: undefined
        }
    ],
  answer: 'Have safe journey on 04 June',
  actions: [],
  sentiment: { score: 0.813,
     numWords: 7,
     numHits: 3,
     average: 0.11614285714285713,
     type: 'senticon',
     locale: 'en',
     vote: 'positive'
    }
}

Input: I saw spiderman

{ locale: 'en',
  utterance: 'I saw spiderman',
  settings: undefined,
  languageGuessed: false,
  localeIso2: 'en',
  language: 'English',
  explanation: [
        { token: '', stem: '##exact', weight: 1
        }
    ],
  classifications: [
        { intent: 'sawHero', score: 1
        },
        { intent: 'saveName', score: 0
        },
        { intent: 'saveDate', score: 0
        }
    ],
  intent: 'sawHero',
  score: 1,
  domain: 'default',
  optionalUtterance: 'I saw %hero%',
  sourceEntities: [],
  entities: [
        { start: 6,
       end: 14,
       len: 9,
       levenshtein: 0,
       accuracy: 1,
       entity: 'hero',
       type: 'enum',
       option: 'spiderman',
       sourceText: 'spiderman',
       utteranceText: 'spiderman'
        }
    ],
  answers: [
        { answer: 'Really you saw spiderman!', opts: undefined
        }
    ],
  answer: 'Really you saw spiderman!',
  actions: [],
  sentiment: { score: 0.375,
     numWords: 3,
     numHits: 1,
     average: 0.125,
     type: 'senticon',
     locale: 'en',
     vote: 'positive'
    }
}

As you can see, when I passed the input "My name is John", it was not able to extract any entities. I want to extract the name John from the text and return it in the response in place of {{name}}.
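
One way to spot this kind of failure in a response object like the ones above is to compare the `{{placeholders}}` left in `answer` against the names in `entities`. The following is a minimal sketch in plain JavaScript; `unresolvedPlaceholders` is a hypothetical helper, not part of NLP.js, and it assumes only the `answer`, `entities[].entity` fields shown in the posted responses.

```javascript
// Hypothetical helper (not part of NLP.js): list the {{placeholders}} in
// response.answer that have no matching entry in response.entities.
function unresolvedPlaceholders(response) {
  const found = new Set((response.entities || []).map((e) => e.entity));
  const names = [];
  const re = /\{\{\s*([\w.-]+)\s*\}\}/g;
  for (const match of (response.answer || '').matchAll(re)) {
    const name = match[1];
    if (!found.has(name) && !names.includes(name)) names.push(name);
  }
  return names;
}

// Trimmed copies of two of the responses posted above.
const nameResponse = { answer: 'Nice talking to you {{name}}', entities: [] };
const heroResponse = {
  answer: 'Really you saw spiderman!',
  entities: [{ entity: 'hero', utteranceText: 'spiderman' }],
};

console.log(unresolvedPlaceholders(nameResponse)); // [ 'name' ]
console.log(unresolvedPlaceholders(heroResponse)); // []
```

A non-empty result means the intent matched but no entity was extracted for that placeholder, which is the situation for "My name is John" here.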

Thank you

MarketingPip commented 1 year ago

@Jayesh-Patidar - here you go! A full working example of what you need.

import { containerBootstrap } from "https://cdn.skypack.dev/@nlpjs/core@4.26.1";
import { Nlp } from "https://cdn.skypack.dev/@nlpjs/nlp@4.26.1";
import { LangEn } from "https://cdn.skypack.dev/@nlpjs/lang-en-min@4.26.1";

(async () => {
  const container = await containerBootstrap();
  container.use(Nlp);
  container.use(LangEn);
  const nlp = container.get('nlp');
  nlp.settings.autoSave = false;
  nlp.addLanguage('en');
  nlp.slotManager.addSlot('travel', 'fromCity', true);
  nlp.addDocument('en', 'I want to travel from %fromCity% to @toCity', 'travel');

  nlp.addNerBetweenLastCondition('en', 'fromCity', 'from', 'to');
  nlp.addNerAfterLastCondition('en', 'fromCity', 'from');
  nlp.addNerBetweenLastCondition('en', 'toCity', 'to', 'from');
  nlp.addNerAfterLastCondition('en', 'toCity', 'to');

  nlp.slotManager.addSlot('travel', 'fromCity', true);
  nlp.slotManager.addSlot('travel', 'toCity', true);

  await nlp.train();
  const response = await nlp.process('en', 'I want to travel from here to you go bro!');
  console.log(response.entities[0].utteranceText); // Outputs: here
  console.log(response.entities[1].utteranceText); // Outputs: you go bro!
})();

Hope this helps.

PS: you can also use regex matches for extracting slots, but a match-anything pattern such as (.*) makes the library behave oddly.
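
To make the idea behind the "between" and "after last" rules in the example above more concrete, here is a plain-JavaScript sketch of position-based extraction. It is only an approximation of the concept, not the NLP.js implementation: `findWord`, `betweenLast`, and `afterLast` are hypothetical helpers, and the exact matching semantics of `addNerBetweenLastCondition` and `addNerAfterLastCondition` may differ.

```javascript
// Illustrative only: a plain-JS approximation of "between" and "after last"
// extraction rules. This is NOT how NLP.js implements them.

// Find every whole-word occurrence of `word` (assumed to be a plain word
// with no regex metacharacters) and return its start/end offsets.
function findWord(utterance, word) {
  const re = new RegExp(`\\b${word}\\b`, 'gi');
  const hits = [];
  let m;
  while ((m = re.exec(utterance)) !== null) {
    hits.push({ start: m.index, end: m.index + m[0].length });
  }
  return hits;
}

// Text between the last occurrence of `left` and the first `right` after it.
function betweenLast(utterance, left, right) {
  const lefts = findWord(utterance, left);
  if (lefts.length === 0) return undefined;
  const lastLeft = lefts[lefts.length - 1];
  const nextRight = findWord(utterance, right).find((r) => r.start >= lastLeft.end);
  if (!nextRight) return undefined;
  return utterance.slice(lastLeft.end, nextRight.start).trim();
}

// Text after the last occurrence of `word`.
function afterLast(utterance, word) {
  const hits = findWord(utterance, word);
  if (hits.length === 0) return undefined;
  return utterance.slice(hits[hits.length - 1].end).trim();
}

const utterance = 'I want to travel from here to you go bro!';
console.log(betweenLast(utterance, 'from', 'to')); // here
console.log(afterLast(utterance, 'to'));           // you go bro!
```

Because the value is defined by its position relative to anchor words rather than by a list of options, any text can fill the slot, which is what the original question asks for.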

Jayesh-Patidar commented 1 year ago

@MarketingPip Thanks for your reply, and sorry for the late response.

I tried your solution and it works: it extracts the fromCity and toCity entities. But when an answer for this intent contains these entities, they are not filled into the answer.

const { containerBootstrap } = require('@nlpjs/core');
const { Nlp } = require('@nlpjs/nlp');
const { LangEn } = require('@nlpjs/lang-en-min');

(async () => {
    const container = await containerBootstrap();
    container.use(Nlp);
    container.use(LangEn);
    const nlp = container.get('nlp');
    nlp.settings.autoSave = false;
    nlp.addLanguage('en');
    nlp.slotManager.addSlot('travel', 'fromCity', true);
    nlp.addDocument('en', 'I want to travel from %fromCity% to @toCity', 'travel');

    nlp.addNerBetweenLastCondition('en', 'fromCity', 'from', 'to');
    nlp.addNerAfterLastCondition('en', 'fromCity', 'from');
    nlp.addNerBetweenLastCondition('en', 'toCity', 'to', 'from');
    nlp.addNerAfterLastCondition('en', 'toCity', 'to');

    nlp.slotManager.addSlot('travel', 'fromCity', true);
    nlp.slotManager.addSlot('travel', 'toCity', true);

    nlp.addAnswer('en', 'travel', 'Happy journey {{fromCity}} to {{toCity}}');

    await nlp.train();
    const response = await nlp.process('en', 'I want to travel from here to you go bro!');
    console.log(response.entities[0].utteranceText); // Outputs: here
    console.log(response.entities[1].utteranceText); // Outputs: you go bro!
    console.log(response.answer); // Happy journey {{fromCity}} to {{toCity}}
})();

Earlier, this answer syntax worked fine for named entities.

Can you please help me out here?

MarketingPip commented 1 year ago

For some reason I can't get this working on my end. As a workaround, I would try regex matching and replacing the placeholders yourself (inconvenient, I know) until you get an answer from the maintainers of this repo, i.e. @axa-group.

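// Workaround: loop over all your entities and fill each {{entity}} placeholder yourself.

const entities = response.entities;
let replacedText = response.answer;

for (const item of entities) {
  // Build a regex for this entity's placeholder, e.g. {{fromCity}}.
  const placeholder = new RegExp(`\\{\\{\\s*${item.entity}\\s*\\}\\}`, 'g');
  // A function replacement keeps any `$` in the entity text literal.
  replacedText = replacedText.replace(placeholder, () => item.utteranceText);
}
console.log(replacedText); // e.g. Happy journey here to you go bro!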

Adjust that code to work with your needs and the proper regex!
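
As a self-contained variant of this workaround that can be run without the library, the sketch below fills the placeholders in a single pass and leaves any placeholder without a matching entity untouched. `fillAnswer` is a hypothetical helper, not an NLP.js API, and the `response` object is a hand-written mock that assumes the `entity` and `utteranceText` fields seen in the responses posted earlier in this thread.

```javascript
// Hypothetical helper (not part of NLP.js): fill {{entity}} placeholders
// in an answer from the entities array of a response.
function fillAnswer(answer, entities) {
  const values = new Map();
  for (const e of entities) {
    // Keep the first value seen for each entity name.
    if (!values.has(e.entity)) values.set(e.entity, e.utteranceText);
  }
  // Replace known placeholders; leave unknown ones exactly as written.
  return answer.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (whole, name) =>
    values.has(name) ? values.get(name) : whole
  );
}

// Mock response shaped like the one described in the comment above.
const response = {
  answer: 'Happy journey {{fromCity}} to {{toCity}}',
  entities: [
    { entity: 'fromCity', utteranceText: 'here' },
    { entity: 'toCity', utteranceText: 'you go bro!' },
  ],
};

console.log(fillAnswer(response.answer, response.entities));
// Happy journey here to you go bro!
```

Building a lookup map first means the answer is scanned once, and an entity value that happens to contain `{{...}}` is never re-expanded.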

Wish I could help you out more but keep in mind I am not a contributor nor maintainer of this project. Hopefully we can get some more input from them.

Maybe a tag / mention will help here - @jesus-seijas-sp

Best of luck tho & have a great weekend @Jayesh-Patidar. ✌️

Jayesh-Patidar commented 1 year ago

Thanks for your help @MarketingPip. I am waiting to hear from someone; meanwhile I will try your solution.