The usage of pd.Timestamp as the way to hold time information is a solid choice when there is an integer number of samples per second.
However, there are many sample rates that do not have this property. Say data were sampled at 3 Hz, or 30 kHz; the timestamps are then not integer nanoseconds. The only way to use integer-nanosecond timestamps is to relax the normal assumption of uniform sampling.
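As a quick illustration of the problem (a minimal check, not package code): at 3 Hz the sample period is 333333333.33… ns, so storing it as an integer-nanosecond step accumulates drift.

```python
import pandas as pd

fs = 3.0                                  # sample rate in Hz
period_ns = 1e9 / fs                      # 333333333.33... ns, not an integer
dt = pd.Timedelta(round(period_ns), unit="ns")  # forced to integer ns

t0 = pd.Timestamp("2021-01-01")
n = 1_000_000
t_exact_s = n / fs                        # exact elapsed time in seconds
t_stamped = t0 + n * dt                   # what integer-ns stepping gives
error_ns = t_exact_s * 1e9 - (t_stamped - t0).value
print(f"dt = {dt.value} ns, drift after {n} samples: {error_ns:.0f} ns")
```

The per-sample truncation of 1/3 ns adds up to roughly a third of a millisecond over a million samples.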
This may not be a big deal, but it would be nice to have some reference material in the package that addresses best practices for these data -- it may be that the right answer is just to resample_poly up to a nearby sample rate that is represented by integer-nanosecond timestamps. Or maybe we generate a slightly non-uniform time axis using the pandas equivalent of linspace(start, end, num_samples), but it would be nice to have some tests that assure that sample rates returned by these are correct, or at least quantify the errors.
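The linspace-style option can be sketched with pd.date_range(start, end, periods=n), which distributes integer-nanosecond steps as evenly as possible; a quick check (illustrative, not package code) quantifies how far each step deviates from the nominal period:

```python
import numpy as np
import pandas as pd

fs = 3.0
n = 1000
start = pd.Timestamp("2021-01-01")
end = start + pd.Timedelta(seconds=(n - 1) / fs)  # 333 s exactly here

# Endpoints are honored; interior stamps are spread as evenly as int-ns allows.
axis = pd.date_range(start=start, end=end, periods=n)
steps_ns = np.diff(axis.asi8)             # the integer-ns steps actually used
nominal_ns = 1e9 / fs                     # 333333333.33... ns

print("unique steps (ns):", np.unique(steps_ns))
print("max per-step error (ns):", np.abs(steps_ns - nominal_ns).max())
```

The axis is slightly non-uniform (steps alternate between two adjacent integer values), but each step is within 1 ns of the nominal period and the total duration is exact.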
Note that resistics uses the attotime package to deal with this (I think).
One other thing:
Xarray is quite strict when merging time series if timestamps are not exact matches. We generate time stamps in various places in the code, and these could perhaps all refer to a single base class that makes timestamps. This would reduce the risk of timestamp mismatches due to "gates and fenceposts" conventions (i.e. does a time stamp refer to an interval $\Delta t$ wide or an instant?), and ensure things like end - start being the actual duration of the time series, not off by one sample.
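The shared-factory idea could look something like this (a hypothetical sketch; make_time_axis and its signature are illustrative, not existing package API):

```python
import pandas as pd

def make_time_axis(
    start: pd.Timestamp, sample_rate: float, n_samples: int
) -> pd.DatetimeIndex:
    """Build a time axis where each stamp marks the *start* of its sample
    interval, so last - first is (n_samples - 1) / sample_rate exactly
    (to nanosecond resolution) -- not off by one sample."""
    end = start + pd.Timedelta(seconds=(n_samples - 1) / sample_rate)
    return pd.date_range(start=start, end=end, periods=n_samples)

# Two axes built through the same factory match stamp-for-stamp,
# so xarray merges align exactly instead of failing on near-misses.
t1 = make_time_axis(pd.Timestamp("2021-01-01"), 30_000.0, 30_000)
t2 = make_time_axis(pd.Timestamp("2021-01-01"), 30_000.0, 30_000)
```

Routing every time-axis construction through one function like this makes the interval-vs-instant convention a single documented decision rather than one repeated (and possibly inconsistent) in each call site.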