comtravo / ctparse

Parse natural language time expressions in python
https://www.comtravo.com
MIT License

Duration #98

Closed gabrielelanaro closed 4 years ago

gabrielelanaro commented 4 years ago

Summary:

We would like to introduce a new type, Duration, to express time deltas that are not anchored to a specific date. Put another way, a duration is something that can be attached to a date to form an interval:

Date + Duration = interval
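
To make the relation above concrete, here is a minimal sketch. The `Duration` and `Interval` classes and the `anchor` helper are hypothetical stand-ins built on the standard library, not the types actually implemented in ctparse:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Duration:
    """Hypothetical unanchored time delta (illustration only)."""
    delta: timedelta


@dataclass(frozen=True)
class Interval:
    """Hypothetical anchored span with a start and an end."""
    start: datetime
    end: datetime


def anchor(start: datetime, duration: Duration) -> Interval:
    # Date + Duration = interval: attach the delta to a concrete start.
    return Interval(start=start, end=start + duration.delta)


# "2 hours" has no position in time until it is attached to a date.
two_hours = Duration(timedelta(hours=2))
interval = anchor(datetime(2020, 3, 1, 9, 0), two_hours)
```

The same `two_hours` value can be anchored to any number of dates, which is the property that distinguishes it from an Interval.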

Things I don't like:

  1. Format: the format for Time and Interval is a string that specifies the date, the hour and some other details. What would be a good format for Duration? I made up a string format that is a bit different from the one for Time, but it may make sense to put some more thought into it.

  2. "one night" ("eine Übernachtung" in German, i.e. an overnight stay) is a duration of roughly 12 to 24 hours, with the caveat that it should cover a whole night. A simple timedelta data structure won't capture this information. We could, however, assume that this is irrelevant, as such caveats can get quite complicated. Another example is "2 hours in the morning". To account for this, one would need to add a modifier, e.g. Duration(12 hours / night), but this seems very domain-specific and I will implement it only if truly necessary.
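
For illustration, the modifier idea from point 2 could look roughly like the sketch below. The class name `ConstrainedDuration`, its fields, and the string format are assumptions made up for this example (the format just mirrors the `Duration(12 hours / night)` notation above); none of it is the format used in the PR:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConstrainedDuration:
    """Hypothetical duration with an optional part-of-day modifier."""
    value: int
    unit: str                         # e.g. "hours"
    constraint: Optional[str] = None  # e.g. "night", "morning"

    def __str__(self) -> str:
        # Only mention the modifier when one is present.
        if self.constraint is None:
            return f"Duration({self.value} {self.unit})"
        return f"Duration({self.value} {self.unit} / {self.constraint})"


one_night = ConstrainedDuration(12, "hours", "night")
plain = ConstrainedDuration(2, "hours")
```

A plain timedelta would treat `one_night` and any other 12-hour span as equal; keeping the constraint as a separate field is one way to preserve the distinction without committing to how it is resolved.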

See https://kentonl.com/pub/ladz-acl.2014.pdf for some insight into the problem. Basically, time expressions could be represented using a sort of grammar that encodes these various details.

gabrielelanaro commented 4 years ago

@sebastianmika please take another look; I took a stab at writing a bunch of rules that make use of Duration. I haven't tackled the TOD part because it requires some careful design and possibly a change in the way we build the annotated corpus (for example, dealing with veryearlyevening, first, last, and whether a Time with TOD is just an interval).