iainbeeston / nickel

Nickel extracts date, time, and message information from naturally worded text.
MIT License

Some dates and times not being extracted #10

Open scanferla opened 10 years ago

scanferla commented 10 years ago

I think Nickel does the best work out there when it comes to date extraction, but there are some dates and times that are not being extracted:

irb(main):022:0> Nickel.parse "You have 15 minutes to complete this task"
=> message: "You have minutes to complete this task", occurrences: []

The date is not extracted when "!" is added:

irb(main):023:0> Nickel.parse "Let's do it today"
=> message: "lets do it", occurrences: [#<Occurrence type: single, start_date: "20140213">]
irb(main):024:0> Nickel.parse "Let's do it today!"
=> message: "lets do it today!", occurrences: []

A time is set for "afternoon" but not for "morning" or "evening", and "8 o'clock" is not recognized either:

irb(main):025:0> Nickel.parse "Tomorrow afternoon"
=> message: "", occurrences: [#<Occurrence type: single, start_date: "20140214", start_time: "120000">]
irb(main):026:0> Nickel.parse "Tomorrow morning"
=> message: "morning", occurrences: [#<Occurrence type: single, start_date: "20140214">]
irb(main):033:0> Nickel.parse "Tomorrow evening"
=> message: "evening", occurrences: [#<Occurrence type: single, start_date: "20140214">]
irb(main):039:0> Nickel.parse "Meeting at 8 o'clock"
=> message: "Meeting", occurrences: []

Thanks!

iainbeeston commented 10 years ago

So there are 4 bugs here:

scanferla commented 10 years ago

Maybe you can set defaults for "morning", "afternoon" and "evening", such as 9am, 3pm and 9pm (the "middle" of each period), while giving the option for these values to be changed.
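A minimal sketch of what such overridable defaults could look like. The constant and method names here are hypothetical and are not part of Nickel; the times use the same HHMMSS string form that appears in the output above.

```ruby
# Hypothetical sketch: these names do not exist in Nickel.
# Default times (HHMMSS) for each part of the day, as suggested above.
DEFAULT_DAY_PART_TIMES = {
  "morning"   => "090000",
  "afternoon" => "150000",
  "evening"   => "210000"
}.freeze

# Returns the HHMMSS string for a day part, or nil if the word is unknown.
# `overrides` lets a caller replace any of the defaults.
def day_part_time(word, overrides = {})
  DEFAULT_DAY_PART_TIMES.merge(overrides)[word.downcase]
end

day_part_time("morning")                         # => "090000"
day_part_time("Evening", "evening" => "190000")  # => "190000"
```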

I've found some other things:

irb(main):002:0> Nickel.parse "First monday of the month"
NameError: undefined local variable or method `day_of_week' for "1st mon of the month ":Nickel::NLPQuery
irb(main):007:0> Nickel.parse "lets go dancing 3 in the morning"
=> message: "lets go dancing", occurrences: []
irb(main):019:0> Nickel.parse "Lets go dancing on 07-12-2014"
=> message: "Lets go dancing", occurrences: [#<Occurrence type: daily, start_date: "20140228", interval: 1>]
irb(main):036:0> Nickel.parse "Lets go dancing every month"
=> message: "Lets go dancing every month", occurrences: []
irb(main):016:0> Nickel.parse "Every sunday until 2015"
=> message: "", occurrences: [#<Occurrence type: weekly, start_date: "20140302", start_time: "201500", interval: 1, day_of_week: 6>]

Almost the same issue as "You have 2 days" (which returns 0 occurrences). I really don't know if it would be possible or make any sense, but I guess "2 days" could mean 2 days from now no matter what comes before it: have, within, in, etc. "For the next 2 days" does return the correct occurrence, though.

irb(main):022:0> Nickel.parse "Deliver it within 2 days"
=> message: "Deliver it with2 day", occurrences: [#<Occurrence type: single, start_date: "20140228">]
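To illustrate the idea of treating "N days" as an offset from today regardless of the preceding word, here is a rough sketch. `days_from_now` is a hypothetical helper, not something Nickel provides, and it deliberately ignores cases such as "2 days ago" that a real parser would have to handle.

```ruby
require "date"

# Hypothetical helper, not part of Nickel.
# Finds the first "<number> day(s)" in the text and returns that many days
# after `today`, or nil when no such phrase is present.
def days_from_now(text, today: Date.today)
  match = text.match(/\b(\d+)\s+days?\b/i)
  return nil unless match

  today + match[1].to_i
end

days_from_now("Deliver it within 2 days", today: Date.new(2014, 3, 1))
# => Date for 2014-03-03
days_from_now("You have 2 days", today: Date.new(2014, 3, 1))
# => Date for 2014-03-03
```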

I know it would add complexity, but maybe in a future release we could have some kind of score for each occurrence, possibly with one or more alternative occurrences ordered by score.
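As a rough illustration of the scoring idea, the sketch below ranks alternative readings of the same text. `Candidate`, `rank` and the score values are all made up for the example; Nickel has no scoring API.

```ruby
# Hypothetical sketch: Nickel has no scoring API, and these scores are invented.
Candidate = Struct.new(:start_date, :score)

# Returns the candidates ordered from highest to lowest score.
def rank(candidates)
  candidates.sort_by { |candidate| -candidate.score }
end

# Two possible readings of "07-12-2014".
alternatives = [
  Candidate.new("20140712", 0.4),  # month-day-year reading
  Candidate.new("20141207", 0.6)   # day-month-year reading
]

rank(alternatives).map(&:start_date)  # => ["20141207", "20140712"]
```

A caller could take the first element as the best guess and still show the remaining alternatives to the user.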

iainbeeston commented 10 years ago

I'll look at those when I can.

"First Monday of the month" should no longer give an error on master, thanks to @copiousfreetime (see #11). I'll release that shortly, along with a big batch of refactoring that I've been working on.

scanferla commented 10 years ago

For instance, "07-12-2014" here in Brazil would be the 7th of December 2014, as the order is always day, month, year. I don't know whether other countries use month/day/year with "/" but day-month-year with "-".
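One way to handle this ambiguity would be to let the caller state the expected field order. The sketch below uses Ruby's standard `Date.strptime`; `parse_numeric_date` and its `order:` option are hypothetical and not part of Nickel.

```ruby
require "date"

# Hypothetical helper, not part of Nickel.
# Maps a field order to the strptime format used to read the date.
DATE_FORMATS = { dmy: "%d-%m-%Y", mdy: "%m-%d-%Y" }.freeze

# Parses a numeric date written with "-" or "/" separators.
# `order` is :dmy (day-month-year) or :mdy (month-day-year);
# any other value raises KeyError.
def parse_numeric_date(text, order: :mdy)
  Date.strptime(text.tr("/", "-"), DATE_FORMATS.fetch(order))
end

parse_numeric_date("07-12-2014", order: :dmy)  # => 7 December 2014
parse_numeric_date("07/12/2014", order: :mdy)  # => 12 July 2014
```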

scanferla commented 10 years ago

Hi, I've found some more:

This returns today at 1am instead of 5 minutes from now:

irb(main):001:0> Nickel.parse "This one is great. Meet me in 5 minutes"
=> message: "This is great Meet me", occurrences: [#<Occurrence type: single, start_date: "20140307", start_time: "010000">]

This returns today at 2am instead of 5 minutes from now:

irb(main):003:0> Nickel.parse "Give me two cents in 5 minutes"
=> message: "Give me cents", occurrences: [#<Occurrence type: single, start_date: "20140307", start_time: "020000">]

Here the date is correctly 3 days from now, but the time is set to 2am:

irb(main):002:0> Nickel.parse "Give me two cents in 3 days"
=> message: "Give me cents", occurrences: [#<Occurrence type: single, start_date: "20140310", start_time: "020000">]