unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License
3.3k stars 307 forks

Built in checks for Time series dataframes #842

Open vignkri opened 2 years ago

vignkri commented 2 years ago

Is your feature request related to a problem? Please describe.

DataFrames indexed by a DatetimeIndex or a related time-based index need specialised validation checks. Pandas supports three time-based index types: DatetimeIndex, TimedeltaIndex, and PeriodIndex.

Each index type has a different underlying dtype, so built-in checks tailored to each one could be useful.
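
To make the distinction concrete, here is a minimal sketch in plain pandas that builds one index of each time-based kind and prints its dtype. The helper name `time_index_kind` is hypothetical and only for illustration; it is not part of pandas or pandera.

```python
import pandas as pd


def time_index_kind(index: pd.Index) -> str:
    """Return a label for the kind of time-based index, or 'other'.

    Hypothetical helper for illustration; not a pandas or pandera API.
    """
    if isinstance(index, pd.DatetimeIndex):
        return "datetime"
    if isinstance(index, pd.TimedeltaIndex):
        return "timedelta"
    if isinstance(index, pd.PeriodIndex):
        return "period"
    return "other"


# One small example of each time-based index type.
datetime_index = pd.date_range("2023-01-01", periods=3, freq="D")
timedelta_index = pd.timedelta_range(start="1 day", periods=3, freq="D")
period_index = pd.period_range("2023-01-01", periods=3, freq="D")

# Each index carries its own dtype, which a check would have to account for.
for index in (datetime_index, timedelta_index, period_index):
    print(time_index_kind(index), index.dtype)
```

Dispatching on the index class rather than on the dtype string keeps the sketch independent of the time resolution pandas chooses for the underlying values.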

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Edits

cosmicBboy commented 2 years ago

Amazing @vignkri !

Super excited to get built-in support for this :)

So for each of the tests you list above, it would help if you could provide code snippets that implement the checks using the vanilla pandas API (ideally on a pd.Series of the DateTime, Period, and Timedelta dtypes)... and once we have those it'll be super straightforward to add them to the pa.Check class namespace.
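
As a rough illustration of the kind of snippet being asked for, the sketch below implements two checks in vanilla pandas that work on a pd.Series of any of the three dtypes. The specific checks from the original list are not reproduced here; `is_time_like` and `within_bounds` are hypothetical stand-ins chosen for illustration. Since pa.Check accepts a callable that takes a Series, functions of this shape could presumably be wrapped as custom checks, though that wiring is not shown.

```python
import pandas as pd


def is_time_like(series: pd.Series) -> bool:
    """True if the series has a datetime, timedelta, or period dtype.

    Hypothetical check for illustration only.
    """
    dtype = series.dtype
    return (
        pd.api.types.is_datetime64_any_dtype(dtype)
        or pd.api.types.is_timedelta64_dtype(dtype)
        or isinstance(dtype, pd.PeriodDtype)
    )


def within_bounds(series: pd.Series, lower, upper) -> pd.Series:
    """Element-wise check that each value lies in [lower, upper].

    Hypothetical check; bounds must be comparable with the series values
    (pd.Timestamp, pd.Timedelta, or pd.Period respectively).
    """
    return series.between(lower, upper)


# One series per dtype family.
dates = pd.Series(pd.date_range("2023-01-01", periods=3, freq="D"))
durations = pd.Series(pd.timedelta_range(start="1 day", periods=3, freq="D"))
periods = pd.Series(pd.period_range("2023-01-01", periods=3, freq="D"))

for series in (dates, durations, periods):
    print(series.dtype, is_time_like(series))

# Element-wise result: only the first two dates fall inside the bounds.
print(within_bounds(dates, pd.Timestamp("2023-01-01"), pd.Timestamp("2023-01-02")))
```

Returning a boolean Series from `within_bounds` rather than a single bool keeps per-row failure information available to whatever consumes the check.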

tkaraouzene commented 11 months ago

Hello !

Any update on this topic? I would be very interested in using this kind of feature. I would add some additional tests for DatetimeIndex:

  1. check no duplicates
  2. check index is monotonic increasing
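
Both checks map onto existing pandas index attributes. A minimal sketch, with hypothetical function names that are not part of pandera:

```python
import pandas as pd


def has_no_duplicates(index: pd.Index) -> bool:
    """True if every label in the index is unique."""
    return bool(index.is_unique)


def is_monotonic_increasing(index: pd.Index) -> bool:
    """True if the index never decreases (repeated values are allowed)."""
    return bool(index.is_monotonic_increasing)


def is_strictly_increasing(index: pd.Index) -> bool:
    """True if the index increases with no repeated values."""
    return has_no_duplicates(index) and is_monotonic_increasing(index)


sorted_index = pd.DatetimeIndex(["2023-01-01", "2023-01-02", "2023-01-03"])
duplicated_index = pd.DatetimeIndex(["2023-01-01", "2023-01-01", "2023-01-02"])
unsorted_index = pd.DatetimeIndex(["2023-01-02", "2023-01-01", "2023-01-03"])

for index in (sorted_index, duplicated_index, unsorted_index):
    print(
        has_no_duplicates(index),
        is_monotonic_increasing(index),
        is_strictly_increasing(index),
    )
```

Note that pandas treats `is_monotonic_increasing` as non-strict, so an index with repeated timestamps still passes it; combining it with the uniqueness check gives a strictly increasing test.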

jonmather commented 6 months ago

I would also love to get this feature added! Is anything blocking this from getting done? I can provide snippets for some of these tests if helpful.

cosmicBboy commented 6 months ago

Contributions welcome 🤗