facebook / duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

'tonight 815' doesn't resolve correctly #592

Closed chessai closed 3 years ago

chessai commented 3 years ago

Some debugging info. Without the "at" but with a colon, it works:

> debug (makeLocale EN Nothing) "tonight 8:15" [Seal Time]
intersect (tonight 8:15)
-- tonight (tonight)
-- -- regex (tonight)
-- hh:mm (8:15)
-- -- regex (8:15)
[
    {
        "body": "tonight 8:15",
        "dim": "time",
        "end": 12,
        "latent": false,
        "start": 0,
        "value": {
            "grain": "minute",
            "type": "value",
            "value": "2013-02-12T20:15:00.000-02:00",
            "values": [
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2013-02-12T20:15:00.000-02:00"
                }
            ]
        }
    }
]

Without the colon, we get an interval back, because only "tonight" matches (a partial match):

> debug (makeLocale EN Nothing) "tonight 815" [Seal Time]
tonight (tonight)
-- regex (tonight)
[
    {
        "body": "tonight",
        "dim": "time",
        "end": 7,
        "latent": false,
        "start": 0,
        "value": {
            "from": {
                "grain": "hour",
                "value": "2013-02-12T18:00:00.000-02:00"
            },
            "to": {
                "grain": "hour",
                "value": "2013-02-13T00:00:00.000-02:00"
            },
            "type": "interval",
            "values": [
                {
                    "from": {
                        "grain": "hour",
                        "value": "2013-02-12T18:00:00.000-02:00"
                    },
                    "to": {
                        "grain": "hour",
                        "value": "2013-02-13T00:00:00.000-02:00"
                    },
                    "type": "interval"
                }
            ]
        }
    }
]

However, "815" can't be recognised as a time without Options{withLatent=True}. With latent enabled:

> debugCustom testContext Options{withLatent=True} "tonight 815" [Seal Time]
tonight (tonight)
-- regex (tonight)
hhmm (latent) (815)
-- regex (815)
[
    {
        "body": "tonight",
        "dim": "time",
        "end": 7,
        "latent": false,
        "start": 0,
        "value": {
            "from": {
                "grain": "hour",
                "value": "2013-02-12T18:00:00.000-02:00"
            },
            "to": {
                "grain": "hour",
                "value": "2013-02-13T00:00:00.000-02:00"
            },
            "type": "interval",
            "values": [
                {
                    "from": {
                        "grain": "hour",
                        "value": "2013-02-12T18:00:00.000-02:00"
                    },
                    "to": {
                        "grain": "hour",
                        "value": "2013-02-13T00:00:00.000-02:00"
                    },
                    "type": "interval"
                }
            ]
        }
    },
    {
        "body": "815",
        "dim": "time",
        "end": 11,
        "latent": true,
        "start": 8,
        "value": {
            "grain": "minute",
            "type": "value",
            "value": "2013-02-12T08:15:00.000-02:00",
            "values": [
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2013-02-12T08:15:00.000-02:00"
                },
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2013-02-12T20:15:00.000-02:00"
                },
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2013-02-13T08:15:00.000-02:00"
                }
            ]
        }
    }
]

Both "tonight" and "815" are now matched, but the two tokens are not intersected, so the result is still wrong.
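To make the expected result concrete, here is a minimal sketch of what intersecting the two tokens should produce. It is a toy model and does not use Duckling's real types or API: `Instant`, `Interval`, `tonight`, `candidates815`, and `within` are all hypothetical names, with the values copied from the output above.

```haskell
-- Toy model only: these are NOT Duckling's types or functions.
-- A point in time as (day of month, hour, minute).
-- Tuples compare lexicographically, which gives chronological order here.
type Instant = (Int, Int, Int)

-- Half-open interval [from, to).
data Interval = Interval { from :: Instant, to :: Instant }

-- "tonight" as reported above: 2013-02-12 18:00 up to 2013-02-13 00:00.
tonight :: Interval
tonight = Interval (12, 18, 0) (13, 0, 0)

-- The three candidate values reported for the latent "815" token.
candidates815 :: [Instant]
candidates815 = [(12, 8, 15), (12, 20, 15), (13, 8, 15)]

-- Keep only the candidates that fall inside the interval.
within :: Interval -> [Instant] -> [Instant]
within (Interval lo hi) = filter (\t -> lo <= t && t < hi)
```

In GHCi, `within tonight candidates815` evaluates to `[(12,20,15)]`, i.e. 20:15 on the 12th, the same value that "tonight 8:15" resolves to.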

Note that "tonight at 815" is recognised if latent is enabled.

This probably just needs us to make the 'at' optional in the at rule.