electrum closed this issue 13 years ago
This is a tricky one. It should be possible to create a custom format which allows other text around it. But you might want to look at Chronic or Nickel https://github.com/lzell/nickel.
The limitation in Timeliness is that it uses regexps, and it removes some regex-specific characters from a format before compiling the string into a regexp. That makes it tricky to work around this and allow open-ended strings with junk in them.
Timeliness is more about control and speed than freedom to parse any string.
Thanks for the link to Nickel -- I hadn't seen that one. What I ended up doing was combining regexes with strptime:
'\d{1,2}/\d{1,2}/\d{2}' => '%m/%d/%y',
'\d{1,2}/\d{1,2}/\d{4}' => '%m/%d/%Y',
'[a-z]{3,} \d{1,2}, \d{4}' => '%b %d, %Y',
You need the regexes because strptime will incorrectly parse strings that don't match the format:
ruby > Date.strptime('01/02/03', '%m/%d/%Y')
=> #<Date: 0003-01-02 (3444309/2,0,2299161)>
ruby > Date.strptime('01/02/2003', '%m/%d/%y')
=> #<Date: 2020-01-02 (4917701/2,0,2299161)>
Fortunately, its laxness causes it to ignore the extra junk at the end.
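For anyone landing here later, a minimal sketch of the regex-plus-strptime idea. The names `LEADING_DATE_FORMATS` and `parse_leading_date` are made up for illustration, and the patterns are my own anchored variants of the ones above. It hands strptime only the matched prefix, so it does not depend on how strptime treats trailing text:

```ruby
require 'date'

# Hypothetical table: pattern anchored at the start of the string => strptime format.
# The four-digit-year pattern comes first so "01/02/2003" is not read as a two-digit year.
LEADING_DATE_FORMATS = {
  /\A\d{1,2}\/\d{1,2}\/\d{4}/       => '%m/%d/%Y',
  /\A\d{1,2}\/\d{1,2}\/\d{2}(?!\d)/ => '%m/%d/%y',
  /\A[a-z]{3,} \d{1,2}, \d{4}/i     => '%b %d, %Y'
}.freeze

# Returns a Date parsed from the start of +text+, or nil if no format fits.
def parse_leading_date(text)
  LEADING_DATE_FORMATS.each do |pattern, format|
    match = pattern.match(text)
    next unless match

    begin
      # Parse only the matched prefix; anything after it is never seen by strptime.
      return Date.strptime(match[0], format)
    rescue ArgumentError
      # The shape matched but the values were not a real date; try the next format.
      next
    end
  end
  nil
end

puts parse_leading_date('March 12, 2011 asdf')
```

Strings that have the right shape but are not real dates (for example a month of 13) fall through and return nil rather than raising.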
I played with Timeliness some more to get this to work and found that this is possible:
Timeliness.parse('March 12, 2011 asdf', :type => :date, :format => 'mmm d, yyyy [a-zA-Z0-9 ]*')
It's very fragile, however: the trailing character class has to list every character that might appear after the date, so you would need to extend it for your input.
I'd like to parse a date from the start of a string, ignoring invalid characters after a valid date. For example, using mmm d, yyyy, parse the following:
The extra characters " is a Monday" would be ignored.