axa-group / nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
MIT License

Contextdata doesn't get replaced in answer #817

Closed Reilaen closed 3 years ago

Reilaen commented 3 years ago

Hello,

I am currently trying to create a chatbot and add some context data to it, as in this example: https://github.com/axa-group/nlp.js/tree/master/examples/14-ner-corpus

But when I run my code against "what is the real name of spiderman?", the {{ hero }} part gets replaced, but the {{ _data[entities.hero.option].realName }} part doesn't.
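
For readers unfamiliar with the template syntax, the symptom can be illustrated with a minimal sketch of placeholder substitution. This is not the nlp.js implementation: `render` is a hypothetical helper, `new Function` stands in for whatever evaluator the library uses, and the shape of the hero data (option name mapped to `realName` and `city`) is an assumption about heros.json.

```javascript
// Hypothetical sketch: resolves {{ expression }} placeholders against a context object.
// Not nlp.js internals; the hero data below is an assumed shape for heros.json.
function render(template, context) {
  const names = Object.keys(context);
  const values = Object.values(context);
  return template.replace(/\{\{\s*(.+?)\s*\}\}/g, (match, expr) => {
    try {
      // Evaluate the expression with the context keys in scope.
      const value = new Function(...names, `return (${expr});`)(...values);
      return value === undefined ? match : String(value);
    } catch (err) {
      // If the expression cannot be evaluated, leave the placeholder untouched.
      return match;
    }
  });
}

const template =
  'The real name of {{ hero }} is {{ _data[entities.hero.option].realName }}';

const fullContext = {
  hero: 'spiderman',
  entities: { hero: { option: 'spiderman' } },
  _data: { spiderman: { realName: 'Peter Parker', city: 'Queens, New York' } },
};
console.log(render(template, fullContext));
// The real name of spiderman is Peter Parker

// Without _data in the context, only the first placeholder can be resolved.
const partialContext = { hero: 'spiderman', entities: fullContext.entities };
console.log(render(template, partialContext));
// The real name of spiderman is {{ _data[entities.hero.option].realName }}
```

The second call reproduces the reported behaviour: a placeholder is only replaced if everything it refers to is present in the context.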

My current code looks like this:

import { NlpManager, ConversationContext } from 'node-nlp'
const manager = new NlpManager({
    languages: ['en'],
    forceNER: true,
    autoSave: false,
    nlu: { useNoneFeature: true }
})
const context = new ConversationContext()
manager.addCorpora('./corpus.json')
await manager.train()
const response = await manager.process(
    'en',
    'what is the real name of spiderman?',
    context
)
console.log(response)

The corpora I use are those linked in the example above: https://github.com/axa-group/nlp.js/blob/master/examples/14-ner-corpus/corpus.json https://github.com/axa-group/nlp.js/blob/master/examples/14-ner-corpus/heros.json

I am not sure if this is a bug or a mistake on my part, and I would be grateful if somebody could clarify it for me.

jesus-seijas-sp commented 3 years ago

Hello!

Hmm... I never thought about using this context information with node-nlp; it was intended more for the v4 way of doing things. But it is possible. Changes to make:

  1. Remove the ConversationContext part; this is now managed automatically by the NLP.
  2. Instead of passing the locale and utterance to process as separate arguments, pass an object that contains locale and utterance. Why? That's a good question... I need to rework the code inside process a little. In the past I differentiated between receiving a string and an object, and when it's an object it acts differently because it is intended to be used inside a pipeline with connectors and so on. I have to rewrite that part so that the context is also calculated correctly when strings are passed.

const { NlpManager } = require('node-nlp');

(async () => {
  const manager = new NlpManager({
    languages: ['en'],
    forceNER: true,
    autoSave: false,
    nlu: { useNoneFeature: true }
  })
  manager.addCorpora('./corpus.json')
  await manager.train()
  const response = await manager.process({ locale: 'en', utterance: 'what is the real name of spiderman?'});
  console.log(response)
})();

Reilaen commented 3 years ago

Hello Jesus,

This worked great, thanks a lot! The reason I use it this way is that I followed the installation guide at https://github.com/axa-group/nlp.js#installation

I realized it is different from the quickstart guide, https://github.com/axa-group/nlp.js/blob/master/docs/v4/quickstart.md, and wasn't quite sure what the difference is. So I stuck with the version on the front page.

Would it be better to follow the v4 quickstart instead?

jesus-seijas-sp commented 3 years ago

Well, node-nlp is still there for compatibility with people who used v3 of nlp.js, and also because it is easier to understand, as it's monolithic and has only one purpose. The v4 quickstart is better and less limited in what you can do than v3, but it also has a steeper learning curve. I definitely recommend v4.

Reilaen commented 3 years ago

Thanks for the quick reply! I stumbled on a problem with the above solution. It seems that context doesn't quite work like one would expect it to.

With follow-up questions like "who is spiderman?" -> "where does he live?", it throws "TypeError: Cannot read property 'option' of undefined". It seems to treat "he" as an option of the context data, but of course it fails when trying to use it.
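
The error itself is plain JavaScript behaviour and can be reproduced without the library. In the sketch below, `_data` is an assumed shape for heros.json and `cityOf` / `safeCityOf` are hypothetical helpers: when the utterance contains no hero, `entities.hero` is undefined and reading `.option` from it throws. The exact message wording depends on the Node.js version, so only the error type is printed.

```javascript
// Assumed shape of the context data; not copied from heros.json.
const _data = {
  spiderman: { realName: 'Peter Parker', city: 'Queens, New York' },
};

// Mirrors the template expression _data[entities.hero.option].city
function cityOf(entities) {
  return _data[entities.hero.option].city;
}

// "where does he live?" contains no hero entity, so entities.hero is undefined.
let errorName;
try {
  cityOf({});
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError

// Guarded variant using optional chaining (Node.js 14 or later).
function safeCityOf(entities) {
  const option = entities.hero?.option;
  return option === undefined ? undefined : _data[option]?.city;
}
console.log(safeCityOf({})); // undefined
console.log(safeCityOf({ hero: { option: 'spiderman' } })); // Queens, New York
```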

jesus-seijas-sp commented 3 years ago

This is because in the quickstart the conversation with the NLP goes through a connector: ConsoleConnector or DirectlineConnector. Connectors have the concept of a conversationId to identify the conversation. With DirectlineConnector you can open several browsers with the bot and each one will have a different conversation, because the conversationId is different.

In your code there is nothing telling the NLP how to identify the conversation, so it has no way to save or load the context based on an identifier. But try this:

const { NlpManager } = require('node-nlp');

(async () => {
  const manager = new NlpManager({
    languages: ['en'],
    forceNER: true,
    autoSave: false,
    nlu: { useNoneFeature: true }
  })
  manager.addCorpora('./corpus.json')
  await manager.train()
  const activity = {
    conversation: {
      id: 'a1'
    }
  }
  let response = await manager.process({ locale: 'en', utterance: 'what is the real name of spiderman?', activity });
  console.log(response.answer)
  response = await manager.process({ locale: 'en', utterance: 'and where he lives?', activity });
  console.log(response.answer)
})();

The activity in the input gives it a conversation identifier that the context manager can retrieve to identify the conversation. You usually don't have to bother with this, because you are using the NLP through a connector that already has this conversation id.
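
The idea of a context keyed by conversation id can be sketched as follows. This is only an illustration, not the nlp.js context manager: `contexts` and `getContext` are hypothetical names, and the only thing taken from the thread is the `activity.conversation.id` path.

```javascript
// Hypothetical in-memory store: one context object per conversation id.
const contexts = new Map();

function getContext(input) {
  const id =
    input.activity && input.activity.conversation && input.activity.conversation.id;
  if (id === undefined) {
    // No identifier: nothing can be saved or loaded, so each call starts empty.
    return {};
  }
  if (!contexts.has(id)) {
    contexts.set(id, {});
  }
  return contexts.get(id);
}

const activity = { conversation: { id: 'a1' } };

// First turn: remember the hero for conversation 'a1'.
getContext({ activity }).hero = 'spiderman';

// Second turn with the same id: the hero is still there.
console.log(getContext({ activity }).hero); // spiderman

// A turn without an activity cannot find any stored context.
console.log(getContext({}).hero); // undefined
```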

Reilaen commented 3 years ago

Great! Thanks a lot for the thorough explanations, I really learned a lot from you! That mostly solves the issue, except when "and where he lives?" is asked as the first question. Then it still crashes with the same error.

jesus-seijas-sp commented 3 years ago

Yeah... that's normal. You're using a variable that is not defined (entities.hero does not exist). In this case... well... try/catch? Nah, just kidding. Answers can be conditional: along with the answer text you can add a condition that must be true for the answer to be chosen. In this case, if entities.hero is undefined... well, better take a look:

      "answers": [
        { "answer": "{{ hero }} lives at {{ _data[entities.hero.option].city }}", "opts": "entities.hero !== undefined" },
        { "answer": "You have to specify a hero", "opts": "entities.hero === undefined" }
      ]
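
How such conditions select an answer can be sketched like this. It is a stand-in, not the library's code: `pickAnswer` is a hypothetical helper and `new Function` replaces whatever expression evaluator nlp.js uses internally. Only the `answer` / `opts` structure comes from the snippet above.

```javascript
// Hypothetical helper: returns the first answer whose "opts" condition is true.
function pickAnswer(answers, context) {
  for (const { answer, opts } of answers) {
    // Evaluate the condition string with `entities` in scope.
    const condition = new Function('entities', `return (${opts});`);
    if (condition(context.entities)) {
      return answer;
    }
  }
  return undefined;
}

const answers = [
  {
    answer: '{{ hero }} lives at {{ _data[entities.hero.option].city }}',
    opts: 'entities.hero !== undefined',
  },
  { answer: 'You have to specify a hero', opts: 'entities.hero === undefined' },
];

// No hero in the context: the fallback answer is chosen.
console.log(pickAnswer(answers, { entities: {} }));
// You have to specify a hero

// Hero present: the templated answer is chosen (still to be rendered).
console.log(pickAnswer(answers, { entities: { hero: { option: 'spiderman' } } }));
// {{ hero }} lives at {{ _data[entities.hero.option].city }}
```

Because the two conditions are mutually exclusive, exactly one answer qualifies for any context, so the template that dereferences entities.hero is never reached when the entity is missing.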

This is what the full corpus looks like:

{
  "name": "Corpus with entities",
  "locale": "en-US",
  "contextData": "./heros.json",
  "data": [
    {
      "intent": "hero.realname",
      "utterances": [
        "what is the real name of @hero"
      ],
      "answers": [
        { "answer": "The real name of {{ hero }} is {{ _data[entities.hero.option].realName }}", "opts": "entities.hero !== undefined" },
        { "answer": "You have to specify a hero", "opts": "entities.hero === undefined" }
      ]
    },
    {
      "intent": "hero.city",
      "utterances": [
        "where @hero lives?",
        "what's the city of @hero?"
      ],
      "answers": [
        { "answer": "{{ hero }} lives at {{ _data[entities.hero.option].city }}", "opts": "entities.hero !== undefined" },
        { "answer": "You have to specify a hero", "opts": "entities.hero === undefined" }
      ]
    }
  ],
  "entities": {
    "hero": {
      "options": {
        "spiderman": ["spiderman", "spider-man"],
        "ironman": ["ironman", "iron-man"],
        "thor": ["thor"]
      }
    },
    "email": "/\\b(\\w[-._\\w]*\\w@\\w[-._\\w]*\\w\\.\\w{2,3})\\b/gi"
  }
}

jesus-seijas-sp commented 3 years ago

Code used at index: image

And my test result: image

Reilaen commented 3 years ago

I see, I thought it would recognize by itself that "he" isn't part of entities.hero and would simply ignore it. It works perfectly fine now. Thank you for teaching me more about this awesome library and helping me solve this issue.