Open YogevHend opened 8 years ago
@acheleo The fuzzy switch is often misunderstood (and is probably misnamed). It does not refer to a loose interpretation of what is "date-like"; it refers to parsing strings that contain non-date elements, or elements that can be ignored. I will clarify this in the documentation, which is currently a bit ambiguous:
- fuzzy – Whether to allow fuzzy parsing, allowing for string like “Today is January 1, 2047 at 8:21:00AM”.
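To illustrate what the switch does, here is a minimal sketch using the example string from the documentation line above. It assumes a reasonably recent dateutil in which a failed parse raises ValueError (or a subclass of it); the variable names are only for illustration.

```python
from datetime import datetime
from dateutil import parser

text = "Today is January 1, 2047 at 8:21:00AM"

# Without fuzzy, the non-date words make the whole string unparseable.
try:
    parser.parse(text)
    strict_ok = True
except ValueError:
    strict_ok = False

# With fuzzy=True, tokens that are not part of a date are skipped,
# and only the date and time elements are interpreted.
parsed = parser.parse(text, fuzzy=True)

print(strict_ok)
print(parsed)
```

Note that "Today" contributes nothing to the result here: the date comes entirely from "January 1, 2047".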
I think something like what you describe could be useful, but I think parse is already complicated enough as is. As I've mentioned on a few other issues and PRs, the solution to #123, #125 and #214 is to change the way parser options are specified, either by breaking out the individual parsing functions so they can be extended, by providing a robust way to change them, or both. Depending on the form that actually takes, it might be a simple matter to implement the sort of soft-string matching you are talking about yourself.
Alternatively, I could see adding another function to the parser module that provides some similar functionality. Maybe it could be something like deltaparser, which parses time delta specifications to relativedelta or timedelta objects (relativedelta would probably work best for something like what you are describing, if you want people to be able to parse "a year ago" and "a month ago", since year and month are fundamentally inexact units).
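As a rough sketch of what such a function might look like: parse_delta below is a hypothetical helper, not something that exists in dateutil, and the tiny grammar it accepts ("a/an/N unit(s) ago") is an assumption made only for illustration. It returns a relativedelta so that years and months stay calendar-aware.

```python
import re
from datetime import datetime
from dateutil.relativedelta import relativedelta

# Hypothetical mapping from singular unit words to relativedelta keyword names.
_UNITS = {
    "year": "years",
    "month": "months",
    "week": "weeks",
    "day": "days",
    "hour": "hours",
    "minute": "minutes",
}

# Accepts phrases such as "a year ago", "an hour ago", "3 months ago".
_PATTERN = re.compile(
    r"^\s*(a|an|\d+)\s+(year|month|week|day|hour|minute)s?\s+ago\s*$",
    re.IGNORECASE,
)


def parse_delta(text):
    """Turn a phrase like 'a month ago' into a negative relativedelta."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("Unsupported delta phrase: %r" % text)
    count_text, unit = match.groups()
    count = 1 if count_text.lower() in ("a", "an") else int(count_text)
    return relativedelta(**{_UNITS[unit.lower()]: -count})


# relativedelta clips to the last valid day of the target month.
one_month_back = datetime(2020, 3, 31) + parse_delta("a month ago")
print(one_month_back)
```

The last two lines show why relativedelta fits better than timedelta here: "a month ago" from March 31 has no fixed length in days, and relativedelta resolves it to the end of February.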
I definitely agree that fuzzy does not mean what I suggested in any way. I referred to it since I've seen that it does follow the same logic when trying to parse 'today'.
It could be a good idea to have an expandable part to dateutil, that is able to cope with soft logic parsing. I've also noticed that the parser indeed does not cope with the word 'ago' correctly, which by your explanation is perfectly understandable.
Right now I am doing my own replacement of words before passing the string forward to dateutil parser.
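For anyone wanting to do the same, a minimal sketch of that kind of pre-processing might look like the following. The word table and the replace_relative_words helper are my own assumptions for illustration, not part of dateutil, and only three words are covered.

```python
import re
from datetime import date, datetime, timedelta
from dateutil import parser

# Hypothetical word table: day offsets relative to a reference date.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}

_WORDS = re.compile(r"\b(" + "|".join(RELATIVE_DAYS) + r")\b", re.IGNORECASE)


def replace_relative_words(text, today):
    """Replace 'yesterday', 'today' and 'tomorrow' with ISO dates."""
    def substitute(match):
        offset = RELATIVE_DAYS[match.group(0).lower()]
        return (today + timedelta(days=offset)).isoformat()

    return _WORDS.sub(substitute, text)


# A fixed reference date keeps the example reproducible.
reference = date(2020, 3, 15)
rewritten = replace_relative_words("Tomorrow at 5PM", reference)
parsed = parser.parse(rewritten)

print(rewritten)
print(parsed)
```

Because the relative word is turned into an explicit date before parsing, the parser no longer has to ignore it, and fuzzy is not needed.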
I think you misunderstand the "today" situation. The default value of the "default" argument is the current date (frankly, a bit of an unfortunate choice in my opinion). So if you pass it "Today at 5PM", it will give you the current date at 5 PM, true, but if you pass it "Tomorrow at 5PM" or even just "5PM", it will also give you the current date at 5 PM.
If you use fuzzy_with_tokens, you'll see which parts of the string the parser ignored. "Today" will be in that list.
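A small sketch of both points, with a fixed value passed as default so the output does not depend on the day it is run (the specific date is just an arbitrary choice for illustration):

```python
from datetime import datetime
from dateutil import parser

# Stand-in for "the current date"; missing fields are filled from this value.
default = datetime(2020, 3, 15)

# None of these strings contains a date the parser understands,
# so the date part always comes from `default`.
results = {
    phrase: parser.parse(phrase, fuzzy=True, default=default)
    for phrase in ("Today at 5PM", "Tomorrow at 5PM", "5PM")
}

# fuzzy_with_tokens returns the parsed datetime plus the skipped text.
parsed, skipped = parser.parse(
    "Today at 5PM", fuzzy_with_tokens=True, default=default
)

for phrase, value in results.items():
    print(phrase, "->", value)
print(skipped)
```

All three phrases come out as the same datetime, and "Today" shows up among the skipped tokens rather than influencing the result.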
I see, yeah, that is pretty confusing. Is it possible to maybe add some word based parser as a configurable addition? Just like a fuzzy switch.
Like I said, I don't really think it's necessary to add that complexity to the existing parser, which is really there to parse reasonable date formats. I'm certainly not against the idea, but I'd rather work on making the parser less monolithic and more agile. It remains to be seen whether, as a consequence of those changes, something like what you are looking for could be easier to implement.
That said, I would not be surprised if something like this already exists.
dateutil.parser does not support certain phrases when using the fuzzy switch:
and so on and so forth. Is it possible to add this to the supported strings the parser is able to go through? I believe a finite list containing the common time phrases shouldn't be too long and can provide a lot of coverage.