AmyOlex / Chrono

Parsing time normalizations from text.
GNU General Public License v3.0

Part-of-Day #52

Open maffeyl opened 6 years ago

maffeyl commented 6 years ago

Got most of the parts-of-day, but still missing two instances where "am" and "pm" are used as shorthand: "Instructions: 50-mg am, 25-mg pm (25-mg pills)."

AmyOlex commented 6 years ago

This makes sense, as AM and PM are being used here as synonyms for morning and evening. I think we may need some additional rules: if we find an AM or PM without an associated hour, we look for signs that it is being used as a synonym. I wonder what the POS tags for these are? Maybe we could use them as guidance. We have to exclude instances that are not temporal references, like "I am", while instances like "the am" do refer to the morning. In "I am", "am" is always a verb, so my thought would be to pull out the POS tag, and if the token is not a verb, is a lone instance of "am", and is not part of another word, then we can classify it as a part of day. However, we would also need to watch out for the radio "AM". If we had more examples then ML might help, but I don't think we have enough.
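A rough sketch of the rule described above, only to make the idea concrete. This is not Chrono's code: the function name, the regexes, and the MORNING/EVENING labels are all made up for illustration. It also assumes no POS tagger is available, so a preceding "I" stands in for the "is it a verb?" check, and a following "radio", "station", or "/FM" stands in for the radio check. A real version would probably swap those stand-ins for actual POS tags.

```python
import re

# Lone "am"/"pm" (also "a.m."/"p.m."), not part of another word.
AMPM = re.compile(r"\b([ap])\.?m\b\.?", re.IGNORECASE)
# An hour such as "8" or "8:30" directly before the token means a clock time.
HOUR_BEFORE = re.compile(r"\d{1,2}(?::\d{2})?\s*$")
# Stand-in for the POS check (assumption): "I am" is the verb, not a time.
PRONOUN_BEFORE = re.compile(r"\bI\s+$", re.IGNORECASE)
# Stand-in for the radio check (assumption): "AM radio", "AM station", "AM/FM".
RADIO_AFTER = re.compile(r"\s*(?:/\s*fm\b|radio\b|station\b)", re.IGNORECASE)


def lone_ampm_parts_of_day(text):
    """Return (offset, matched_text, label) for am/pm used as a part of day."""
    hits = []
    for m in AMPM.finditer(text):
        before, after = text[:m.start()], text[m.end():]
        if HOUR_BEFORE.search(before):
            continue  # has an associated hour, so it is a normal clock time
        if PRONOUN_BEFORE.search(before):
            continue  # "I am": verb usage, not temporal
        if RADIO_AFTER.match(after):
            continue  # radio band, not temporal
        label = "MORNING" if m.group(1).lower() == "a" else "EVENING"
        hits.append((m.start(), m.group(0), label))
    return hits


example = "Instructions: 50-mg am, 25-mg pm (25-mg pills)."
for offset, token, label in lone_ampm_parts_of_day(example):
    print(offset, token, label)
```

On the example from this issue, the sketch flags "am" as MORNING and "pm" as EVENING, while skipping "I am tired", "at 8 am", and "AM radio". The surface checks are deliberately crude and would miss verb uses such as "we am" in noisy text, which is where real POS tags would help.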