Closed MarcoGorelli closed 2 months ago
This looks like an issue with pyarrow.
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: s = pd.Series(pd.date_range('2000', periods=3, freq='15ms'))
In [4]: pa.Array.from_pandas(s)
Out[4]:
<pyarrow.lib.TimestampArray object at 0x00000183A8E793C0>
[
2000-01-01 00:00:00.000000000,
2000-01-01 00:00:00.015000000,
2000-01-01 00:00:00.030000000
]
In [5]: pa.compute.microsecond(pa.Array.from_pandas(s))
Out[5]:
<pyarrow.lib.Int64Array object at 0x00000183A8E7BDC0>
[
0,
0,
0
]
Thanks for checking. It looks like this is not a bug in pyarrow; pyarrow's method just does something different:
https://arrow.apache.org/docs/python/generated/pyarrow.compute.microsecond.html
"Microsecond returns number of microseconds since the last full millisecond"
So, I think this needs working around in pandas
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
For a non-pyarrow-backed dtype, it returns the total number of microseconds since the last full second.
For a pyarrow-backed dtype, it just returns 0.
Expected Behavior
Installed Versions