HiSPARC / sapphire

SAPPHiRE, a framework for HiSPARC
https://docs.hisparc.nl/sapphire/
GNU General Public License v3.0
7 stars 9 forks source link

`numpy.uint64` ext_timestamp fail to split into correct ts, ns pairs #190

Open tomkooij opened 5 years ago

tomkooij commented 5 years ago

See code below:

An ext_timestamp read from HDF5 is numpy.uint64 with different math rules (appearantly) as compared to int...

This might break a lot of (all) code that transforms ext_timestamp into ts, ns pairs.

(Fix: cast to int)

In [24]: ext_timestamp = 1547942403452849057
    ...: timestamp = int(ext_timestamp / int(1e9))
    ...: nanoseconds = int(ext_timestamp % int(1e9))
    ...: timestamp, nanoseconds
    ...:
Out[24]: (1547942403, 452849057)

In [25]: ext_timestamp % int(1e9)
Out[25]: 452849057

In [26]: et
Out[26]: 1547942403452849057

In [27]: et % int(1e9)
Out[27]: 452849152.0

In [28]: type(et)
Out[28]: numpy.uint64
153957 commented 5 years ago

This kind of calculation does not seem to be used very much (I looked for '% int(1e9)') in SAPPHiRE, only a few simulations and in time_deltas, but in both cases it appears that python ints are used, so that should not be an issue.

tomkooij commented 5 years ago

@153957 : I removed CRITICAL from the title. Maybe not so critical afterall.

It does bug me that this breaks when using numpy.uint64 vs int. (~This might actually be a numpy.uint64 bug/flaw~ EDIT: No, this is expected behaviour: int % numpy.uint64 will be evaluated by first casting both values to float. With the loss of precission.). We must cast everything to int before calculating the modulus.

I'd like to fix this by creating a sapphire.utils.split_ext_timestamp function that is safe for int and numpy.uint64. And just import / use that everywhere (also in publicdb).

But I just fixed it in publicdb.api.datastore for now.

153957 commented 6 months ago
from numpy import uint64

ext_timestamp = 1547942403_452849057
np_ext_timestamp = uint64(ext_timestamp)
ext_timestamp == np_ext_timestamp  # True

# timestamp
ext_timestamp // int(1e9)  # 1547942403
np_ext_timestamp // int(1e9)  # 1547942403.0

# nanoseconds
ext_timestamp % int(1e9)  # 452849057
np_ext_timestamp % int(1e9)  # 452849152.0
np_ext_timestamp % uint64(int(1e9))  # 452849057

# Using numpy <= 1.24
1400000002_000000050 == uint64(1400000002_000000049)  # True
# Using numpy >= 1.25 <- correct answer
1400000002_000000050 == uint64(1400000002_000000049)  # False

For comparisons (uint64 == int) the newer numpy does not cast to float64, but when performing calculations (uint64 + int) it does convert the values to float64. 🤯