Closed nogweii closed 11 years ago
No, there's no way to do this. Right now Chronic discards the tokens it doesn't care about right here. That is, any token that doesn't contain any tags will be thrown into /dev/null.
I don't see the benefits of extracting the non-tagged items myself, at least not enough to alter the way Chronic works and store them in memory. There will be a ton of caveats when doing this. For example, the word "at" will be tokenized and tags will be applied to it. So if you had "tomorrow I'll be at the station", you'll have "I'll be the station" returned.
Here's how you could extract those values, though (see the other_words variable):
    def tokenize(text, options)
      text = pre_normalize(text)
      tokens = text.split(' ').map { |word| Token.new(word) }
      # Let each tag class scan the tokens and attach its tags
      [Repeater, Grabber, Pointer, Scalar, Ordinal, Separator, TimeZone].each do |tok|
        tok.scan(tokens, options)
      end
      # The words Chronic would normally throw away
      other_words = tokens.reject(&:tagged?).map(&:word) #=> ["Go", "to", "the", "doctor's"]
      tokens.select { |token| token.tagged? }
    end
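To see the split and the "at" caveat without patching Chronic, here is a rough standalone sketch. It does not use Chronic at all: the Token struct, the TAGGED_WORDS list and the split_words helper are made-up stand-ins for illustration, and the real tag classes recognise far more words than this.

```ruby
# Toy stand-in for a tagging tokenizer. NOT Chronic's real classes or word lists.
Token = Struct.new(:word, :tags) do
  def tagged?
    !tags.empty?
  end
end

# Hypothetical, hand-picked words that a date parser would tag.
TAGGED_WORDS = %w[tomorrow today at].freeze

# Returns [date_words, other_words] for the given text.
def split_words(text)
  tokens = text.split(' ').map do |word|
    tags = TAGGED_WORDS.include?(word.downcase) ? [:date_related] : []
    Token.new(word, tags)
  end
  tagged, untagged = tokens.partition(&:tagged?)
  [tagged.map(&:word), untagged.map(&:word)]
end

date_words, other_words = split_words("Go to the doctor's tomorrow")
puts other_words.join(' ') # prints: Go to the doctor's
puts date_words.join(' ')  # prints: tomorrow

# The caveat: "at" gets tagged, so it vanishes from the leftover text.
_, leftover = split_words("tomorrow I'll be at the station")
puts leftover.join(' ')    # prints: I'll be the station
```

Using partition keeps both halves in a single pass, which is roughly what a patched tokenize would have to return if it were to hand back the leftover words alongside the tagged tokens.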
My idea is this: given a string such as "Go to the doctor's tomorrow" from the UI of an application, it would be awesome if there were a way to get the string "Go to the doctor's" as well as the Time instance. Is there a way to do this already?