Open: salva opened this issue 3 years ago
Hi @salva
There's currently a proposal for adding PREFER_TIME_OF_DAY to the settings. If you are interested and/or have ideas, you can write a comment in that issue (https://github.com/scrapinghub/dateparser/issues/802).
On the other hand, what you want to achieve can easily be done with the replace() method of the datetime object:
    older_than = dateparser.parse(args.older_than)
    if older_than:
        older_than = older_than.replace(hour=0, minute=0, second=0, microsecond=0)
Let me know if this fixes your issue :slightly_smiling_face:
> On the other hand, what you want to achieve can be easily performed by using the replace() method of the datetime object: ... Let me know if this fixes your issue
No, not really, because I want the date time rounded only when that part of the date is not explicitly given.
For instance, rounding to the past:
yesterday ==> 2021-05-02 00:00:00
yesterday 13:30 ==> 2021-05-02 13:30:00 # Time is given, so it is not rounded
January ==> 2021-01-01 00:00:00
2021 ==> 2021-01-01 00:00:00
Rounding to the future:
yesterday ==> 2021-05-02 23:59:59.99999
yesterday 13:30 ==> 2021-05-02 13:30:59.99999 # Seconds are not given, so they are rounded up
January ==> 2021-01-31 23:59:59.99999
2021 ==> 2021-12-31 23:59:59.99999 # the last month, day, hour, minute and second are picked
In the end, what I am asking for is a new setting equivalent to PREFER_MONTH_OF_YEAR + PREFER_DAY_OF_MONTH + PREFER_HOUR_OF_DAY + PREFER_MINUTE_OF_HOUR + PREFER_SECOND_OF_MINUTE (supposing the PREFER_DAY_OF_MONTH concept is extended to all those date components).
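The rounding rules requested above can be sketched with the standard library alone. This is only an illustration of the intended semantics, not a dateparser feature: round_datetime, COMPONENTS and the "past"/"future" direction names are hypothetical, and the caller is assumed to already know which component was the finest one given in the input.

```python
import calendar
from datetime import datetime

# Components from coarsest to finest. Illustrative names, not dateparser settings.
COMPONENTS = ("year", "month", "day", "hour", "minute", "second")


def round_datetime(dt, finest_given, direction):
    """Fill every component finer than `finest_given` with its first
    (direction="past") or last (direction="future") possible value."""
    if direction not in ("past", "future"):
        raise ValueError("direction must be 'past' or 'future'")
    given = COMPONENTS.index(finest_given)  # ValueError on unknown names
    lowest = {"month": 1, "day": 1, "hour": 0, "minute": 0, "second": 0}
    highest = {"month": 12, "hour": 23, "minute": 59, "second": 59}
    fields = {"microsecond": 0 if direction == "past" else 999999}
    for name in COMPONENTS[given + 1:]:
        if direction == "past":
            fields[name] = lowest[name]
        elif name == "day":
            # The last day depends on the month, which may itself have been rounded.
            month = fields.get("month", dt.month)
            fields[name] = calendar.monthrange(dt.year, month)[1]
        else:
            fields[name] = highest[name]
    return dt.replace(**fields)


# "yesterday" parsed with the current time filled in; only the day was given.
parsed = datetime(2021, 5, 2, 14, 42, 36, 990992)
print(round_datetime(parsed, "day", "past"))
print(round_datetime(parsed, "day", "future"))
```

A real setting would need the parser itself to report which components were present in the input, which is the part this sketch leaves out.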
Hi @salva
Ok, I see.
In that case, you can get the desired result for the first query using DateDataParser.get_date_data(), which is similar to dateparser.parse() but provides more information:
>>> from dateparser.date import DateDataParser
>>> ddp = DateDataParser(settings={'RETURN_TIME_AS_PERIOD': True})
>>> ddp.get_date_data('yesterday 13:30')
DateData(date_obj=datetime.datetime(2021, 5, 2, 13, 30), period='time', locale='en')
>>> ddp.get_date_data('yesterday')
DateData(date_obj=datetime.datetime(2021, 5, 2, 16, 17, 14, 313943), period='day', locale='en')
So your code would be something like:
    ddp = DateDataParser(settings={'RETURN_TIME_AS_PERIOD': True})
    older_than_data = ddp.get_date_data(args.older_than)
    if older_than_data:
        older_than = older_than_data.date_obj
        if older_than_data.period != 'time':
            # reset the time when it is not specified
            older_than = older_than.replace(hour=0, minute=0, second=0, microsecond=0)
You can use the same approach to get the period and perform the operations you need, but you can only know the last period (example: ddp.get_date_data('January') will have 'month' as period, so you won't know whether the year was missing or not).
You can also check the RELATIVE_BASE setting to see if it works for you.
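As a rough sketch of the RELATIVE_BASE idea: the setting takes a datetime that dateparser uses in place of the current time, so passing today's midnight should make relative expressions such as 'yesterday' inherit 00:00:00. The helper names midnight_base_settings and parse_from_midnight are hypothetical, and the behaviour is worth checking against your dateparser version.

```python
from datetime import datetime


def midnight_base_settings(now=None):
    """Build a dateparser settings dict whose RELATIVE_BASE is today at 00:00."""
    now = now or datetime.now()
    return {"RELATIVE_BASE": now.replace(hour=0, minute=0, second=0, microsecond=0)}


def parse_from_midnight(text, now=None):
    """Parse `text` relative to midnight instead of the current time."""
    # Third-party dependency, imported here so the helper above works without it.
    import dateparser
    return dateparser.parse(text, settings=midnight_base_settings(now))
```

This only covers the round-to-the-past case. Rounding to the future (23:59:59) would still need the period-based adjustment described above.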
If none of these solutions work for you, we could give your proposal a chance and see how it could be implemented :)
Hi, I am working on a program where the user can select some files using newer-than and older-than predicates, as for instance:
The issue I have is that dateparser uses the current time when that part is missing. So, for instance, that --older-than=yesterday above becomes something like 2021-05-02 14:42:36.990992, but what I need is 2021-05-02 00:00:00.
On the other hand, I would like --newer-than=yesterday to become 2021-05-02 23:59:59.99999. And in a similar fashion, I would like --newer-than=2020 to become 2020-12-31 23:59:59.99999.
The idea is very similar to that of PREFER_DAY_OF_MONTH but for all the parts, not just months.