snipsco / snips-nlu

Snips Python library to extract meaning from text
https://snips-nlu.readthedocs.io
Apache License 2.0
3.9k stars, 512 forks

Number not identified as datetime #852

Open ffos opened 5 years ago

ffos commented 5 years ago

This is the opposite of /issues/676. In this case, we want the number to be identified as a datetime, but in some cases it isn't.

dataset.yaml:

---
type: intent
name: find_time
slots:
  - {name: time, entity: snips/datetime}
  - {name: service_open_attribute, entity: service_open_attribute}
utterances:
  - 'at [time](on wednesday)'
  - 'by [time](on wednesday)'
  - 'on [time](on wednesday)'
  - 'for [time](on wednesday)'
  - 'that''s for [time](on wednesday)'
  - '[time](on wednesday)'
  - 'before [time](on wednesday)'
  - 'after [time](on wednesday)'
  - 'around [time](on wednesday)'
  - 'sometime around [time](on wednesday)'
  - 'anytime around [time](on wednesday)'
  - 'anytime [time](on wednesday)'
  - 'sometime [time](on wednesday)'
  - '[time](on wednesday)'
  - 'I was looking for [time](today)'
  - 'no, at [time](on wednesday)'
  - 'no, on [time](on wednesday)'
  - 'no, before [time](on wednesday)'
  - 'no, after [time](on wednesday)'
  - 'no, around [time](on wednesday)'
  - 'no, sometime around [time](on wednesday)'
  - 'no, anytime around [time](on wednesday)'
  - 'no, anytime  [time](on wednesday)'
  - 'no, sometime  [time](on wednesday)'
  - 'open [service_open_attribute](now)'
  - 'for [service_open_attribute](now)'
  - '[service_open_attribute](now)'
  - 'something [service_open_attribute](now)'
  - 'something for [service_open_attribute](now)'
  - 'anything [service_open_attribute](now)'
  - 'anything for [service_open_attribute](now)'
  - 'something that''s open [service_open_attribute](now)'
  - 'I was looking for one that is open [service_open_attribute](now)'
  - 'no, [service_open_attribute](now)'

---
type: entity
name: service_open_attribute
automatically_extensible: false
values:
  - [OPEN_NOW, now, right now, right away, straightaway, straight away, this moment, promptly, pronto, immediately, at once, soon, any time, whenever, when ever, still open, currently open]

After the model is built and run, the outputs are as follows.

Expected output (no bug case):

{
      "input": "for 7" or "at 7",
      "intent": {
        "intentName": "find_time",
        "probability": 1.0
      },
      "slots": [
        {
          "range": {
            "start": ...,
            "end": ...
          },
          "rawValue": "7",
          "value": {
            "kind": "InstantTime",
            "value": "2019-09-12 19:00:00 +10:00",
            "grain": "Hour",
            "precision": "Exact"
          },
          "entity": "snips/datetime",
          "slotName": "time"
        }
      ]
    }

Unexpected output (bug):

{
      "input": "for 7 tomorrow",
      "intent": {
        "intentName": "find_time",
        "probability": 0.28509611603847185
      },
      "slots": [
        {
          "range": {
            "start": 6,
            "end": 14
          },
          "rawValue": "tomorrow",
          "value": {
            "kind": "InstantTime",
            "value": "2019-09-13 00:00:00 +10:00",
            "grain": "Day",
            "precision": "Exact"
          },
          "entity": "snips/datetime",
          "slotName": "time"
        }
      ]
    }

Expected time: "2019-09-13 07:00:00 +10:00"

Unexpected output (bug):

{
      "input": "for 7 on wednesday",
      "intent": {
        "intentName": "find_time",
        "probability": 0.14985857631839544
      },
      "slots": [
        {
          "range": {
            "start": 6,
            "end": 18
          },
          "rawValue": "on wednesday",
          "value": {
            "kind": "InstantTime",
            "value": "2019-09-18 00:00:00 +10:00",
            "grain": "Day",
            "precision": "Exact"
          },
          "entity": "snips/datetime",
          "slotName": "time"
        }
      ]
    }

Expected time: "2019-09-18 07:00:00 +10:00"

Environment:

adrienball commented 5 years ago

Hey @ffos. Indeed, locutions such as "for 7 tomorrow" or "for 7 on wednesday" are not resolved by our datetime parser, as they are slightly too ambiguous. The following alternatives work though:

It is always a bit risky to extend the datetime grammar, as it may have unexpected side effects. For instance, in the sentence "find me a table for 4 on wednesday", it is not clear whether "4" refers to a time or a number of people.

Do you think the alternatives provided above could be good enough? Cheers

ffos commented 5 years ago

Hi Adrien - First, many thanks to your team for building and releasing snips-nlu; it's an impressive tech that has moved us away from AWS Lex.

Re: this issue, thanks for your comment and for clarifying how it can be ambiguous. I can see your argument. What I left out of the issue (my apologies) is that we also apply a single intent filter when making the request to snips. In each case we make the request from a known point in the conversation, where the user is finding a time, as follows:

engine.parse(input_text, ['find_time'])

So, we have a situation where "for 7" (tested in version 0.20.1) unambiguously resolves to a datetime, whereas "for 7 tomorrow" doesn't. We can't accommodate the alternatives, though, because the user of the system could naturally reply with utterances starting with "for " and without "am/pm".

At the moment, we are thinking of falling back to a different intent definition of looking specifically for before <day/date>, which we have a feeling would work with some logic on our end, but we have yet to try it out.
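
The "some logic on our end" idea could look roughly like the sketch below. It is not part of snips-nlu: `merge_bare_hour` is a hypothetical helper that post-processes a parsed slot shaped like the JSON outputs quoted earlier in this issue. It assumes the request is already restricted to `find_time`, so a bare number sitting right before a Day-grain `snips/datetime` slot can be read as an hour; without that restriction, the "table for 4 on wednesday" ambiguity applies. It also assumes the hour is taken literally (7 means 07:00), which matches the expected times stated above.

```python
import re


def merge_bare_hour(text, slot):
    """Hypothetical post-processing step (not a snips-nlu API).

    If a Day-grain snips/datetime slot is immediately preceded by a bare
    one- or two-digit number, fold that number into the slot as the hour.
    Returns a new slot dict, or the original slot when the rule does not apply.
    """
    value = slot.get("value", {})
    if slot.get("entity") != "snips/datetime" or value.get("grain") != "Day":
        return slot

    # Only look at the text that precedes the slot's character range.
    prefix = text[: slot["range"]["start"]]
    match = re.search(r"\b(\d{1,2})\s*$", prefix)
    if match is None:
        return slot

    hour = int(match.group(1))
    if hour > 23:
        return slot

    # The value strings in this issue look like "2019-09-13 00:00:00 +10:00".
    parts = value["value"].split(" ")
    if len(parts) != 3:
        return slot
    date_part, _time_part, tz_part = parts

    start = match.start(1)
    end = slot["range"]["end"]
    merged_value = {
        **value,
        "value": f"{date_part} {hour:02d}:00:00 {tz_part}",
        "grain": "Hour",
    }
    return {
        **slot,
        "range": {"start": start, "end": end},
        "rawValue": text[start:end],
        "value": merged_value,
    }


if __name__ == "__main__":
    # Slot copied from the "for 7 tomorrow" output reported above.
    slot = {
        "range": {"start": 6, "end": 14},
        "rawValue": "tomorrow",
        "value": {
            "kind": "InstantTime",
            "value": "2019-09-13 00:00:00 +10:00",
            "grain": "Day",
            "precision": "Exact",
        },
        "entity": "snips/datetime",
        "slotName": "time",
    }
    merged = merge_bare_hour("for 7 tomorrow", slot)
    print(merged["value"]["value"])  # 2019-09-13 07:00:00 +10:00
```

The helper leaves the slot untouched when no number directly precedes it, so it can be applied to every `find_time` result. It would still need a rule for am/pm disambiguation, which this sketch deliberately does not attempt.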