zopefoundation / Products.ZCatalog

Zope's indexing and search solution.
Other
5 stars 22 forks source link

DateIndex._convert(): special calculation of timestamp value #123

Open drfho opened 3 years ago

drfho commented 3 years ago

Hello, the timestamp calculation in https://github.com/zopefoundation/Products.ZCatalog/blob/master/src/Products/PluginIndexes/DateIndex/DateIndex.py#L141 looks somehow surprising: is there a special reason for e.g. "yr*12"? In my opion a unix timestamp may be more adequate for the job, because its easier to process the timestamps back into a valid datetime value. Or is there any special reason to use that "internal" timestamp format and not a standard unix value for the zcatalog-index?

LINE 141
-         t_val = ((((yr * 12 + mo) * 31 + dy) * 24 + hr) * 60 + mn)
+        t_struct = datetime(yr,mo,dy,hr,mn).timetuple()
+        t_val = time.mktime(t_struct)

Any hint is welcome, looking forward fh

andbag commented 3 years ago

I don't know why the standard libraries were not used. Maybe there was a 2037 issue in previous python or unix (32bit) versions. At least the time unit of t_val is minutes and the type of the variable is not signed integer. I.e. your code should be like this

t_val = int(time.mktime(t_struct)/60)

The value is also missing the minutes that have passed until 1970. Once the code should be changed, the result of the new method is not the same of the old method. Maybe tests must be rewritten and all datetime indexes of running systems with datetime indexes must be rebuilt.

drfho commented 3 years ago

Sorry, the old code ((((yr * 12 + mo) * 31 + dy) * 24 + hr) * 60 + mn) seems not to compute minutes. I have no idea what the resulting number might be Any idea?.

Py code snippet for testing

import time
from datetime import date
from datetime import datetime
from DateTime.DateTime import DateTime

yr = 2021
mo = 6
dy = 9
hr = 12
mn = 0

### OLD
t_val = ((((yr * 12 + mo) * 31 + dy) * 24 + hr) * 60 + mn)
print("OLD", int(t_val))

### NEW
t_struct = datetime(yr,mo,dy,hr,mn).timetuple()
t_val = time.mktime(t_struct)
print("NEW",int(t_val))
>> OLD 1082890800
>> NEW 1623232800

Check Timestaps: https://www.confirado.de/tools/timestamp-umrechner.html

d-maurer commented 3 years ago

drfho wrote at 2021-7-9 04:18 -0700:

Sorry, the old code ((((yr * 12 + mo) * 31 + dy) * 24 + hr) * 60 + mn) seems not to compute minutes.

It does not need to: the indexed value is (in principle) an implementation detail. It can be computed any way so long as the computation ensures that all equivalent allowed input values are mapped to the same indexed value and that the computation happens identically on indexing and searching.

When I remember right, then a DateIndex wanted a reduced resolution (to make the index less bushy/broad for better efficiency). A resolution in the order of minutes was apparently chosen (which is far coarser than that of unit time stamps).

With some modulo computations, you can easily reconstruct "yr", "mo", "dy", "hr" and "mn" from the indexed value (as computed above).