Remove equidistant time-spacing assumption in computing approximate derivative

neuroinformatics-unit / movement

Python tools for analysing body movements across space and time

http://movement.neuroinformatics.dev

BSD 3-Clause "New" or "Revised" License

96 stars 8 forks source link

Remove equidistant time-spacing assumption in computing approximate derivative #268

Closed lochhh closed 1 month ago

lochhh commented 1 month ago

Currently when computing approximate derivatives, we assume equidistant time-spacing with a fixed dt: https://github.com/neuroinformatics-unit/movement/blob/e7ccfe50f65b38dc1a05fc78e1a324df808d6198/movement/analysis/kinematics.py#L105-L113

We should instead provide the actual time axis values, i.e. data.coords["time"].values. Since we do not need to compute derivatives along multiple axes (what np.gradient() can do and xr.differentiate() can't), an even simpler solution would be to use xr.differentiate() (as @sfmig has pointed out in #265), replacing the above lines with:

for _ in range(order): 
     result = result.differentiate("time")

sfmig commented 1 month ago

I wonder if they differ in performance - maybe we could run a quick time test comparing our current implementation and differentiate 🤔

lochhh commented 1 month ago

I wonder if they differ in performance - maybe we could run a quick time test comparing our current implementation and differentiate 🤔

I think they use np.gradient().

niksirbi commented 1 month ago

This sounds like an unambiguously good idea to me 👍🏼

niksirbi commented 1 month ago

I'm also all in for using xr.differentiate().

lochhh commented 1 month ago

I wonder if they differ in performance - maybe we could run a quick time test comparing our current implementation and differentiate 🤔

# input:
dlc_ds.postion.shape = (59999, 2, 12, 2)

# xr.differentiate
%%timeit
dlc_ds.position.differentiate("time")
# 24.2 ms ± 1.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# numpy
result = xr.apply_ufunc(
    np.gradient,
    dlc_ds.position,
    dlc_ds.time.values,
    kwargs={"axis": 0},
)
# 23.4 ms ± 652 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

but because we need to reindex after applying np.gradient, this takes longer

%%timeit
result = xr.apply_ufunc(
    np.gradient,
    dlc_ds.position,
    dlc_ds.time.values,
    kwargs={"axis": 0},
)
result = result.reindex_like(dlc_ds.position)
# 32 ms ± 1.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

sfmig commented 1 month ago

@lochhh thanks for timing it 🤩