dmbee / seglearn

Python module for machine learning time series:
https://dmbee.github.io/seglearn/
BSD 3-Clause "New" or "Revised" License
570 stars 63 forks

Question about data representation #34

Open chkoar opened 5 years ago

chkoar commented 5 years ago

How can I work with seglearn if my data is in the representation presented here?

I have two cases. In the first case I have a single time-dependent variable, and I would like to extract features from its previous values in order to build the X matrix.

2011-01-01 01:00:00    1.073392
2011-01-01 02:00:00    0.274406
2011-01-01 03:00:00    1.446233
2011-01-01 04:00:00   -0.035727

In the second case I have the same problem, but alongside that variable I have one or more other (also time-dependent) variables that I want to use in order to predict the third one.
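To make the two cases concrete, here is a minimal plain-Python sketch of the kind of X matrix being asked about. It is not seglearn code: the function names `make_lagged_matrix` and `make_lagged_matrix_multi`, the choice of two lags, and the values of the second variable are all assumptions made up for illustration. Only the first series comes from the sample above.

```python
def make_lagged_matrix(values, n_lags):
    """Case 1: each row of X holds the n_lags values that precede the target y."""
    if n_lags < 1 or n_lags >= len(values):
        raise ValueError("n_lags must be between 1 and len(values) - 1")
    X = [list(values[i - n_lags:i]) for i in range(n_lags, len(values))]
    y = [values[i] for i in range(n_lags, len(values))]
    return X, y


def make_lagged_matrix_multi(target, others, n_lags):
    """Case 2: append the lag window of every extra variable to each row."""
    X, y = make_lagged_matrix(target, n_lags)
    for row_idx, i in enumerate(range(n_lags, len(target))):
        for other in others:
            X[row_idx].extend(other[i - n_lags:i])
    return X, y


# Hourly values copied from the sample in the question.
series = [1.073392, 0.274406, 1.446233, -0.035727]
X, y = make_lagged_matrix(series, n_lags=2)

# Hypothetical second variable, sampled at the same timestamps.
other = [10.0, 11.0, 12.0, 13.0]
X_multi, y_multi = make_lagged_matrix_multi(series, [other], n_lags=2)
```

With two lags, `X` has two rows (`[1.073392, 0.274406]` and `[0.274406, 1.446233]`) and `y` holds the values that follow each window. In the second case each row simply gains the matching window of the extra variable.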

dmbee commented 5 years ago

seglearn has the transformers Interp and InterpLongToWide - the latter of which I believe may suit your needs. InterpLongToWide converts a long-format dataframe to wide format, and can use various interpolation schemes to deal with missing data / irregular sampling.
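For anyone unfamiliar with the terms, the sketch below shows in plain Python what a long-to-wide conversion with linear interpolation amounts to. It is only an illustration of the idea, not seglearn's implementation or API: the function names, the `(time, variable, value)` record layout, and the sample records are assumptions. The API documentation has the actual InterpLongToWide parameters and input format.

```python
from bisect import bisect_left
from collections import defaultdict


def _interp(samples, t):
    """Linearly interpolate sorted (time, value) samples at time t."""
    times = [s[0] for s in samples]
    j = bisect_left(times, t)
    if times[j] == t:
        return samples[j][1]
    (t0, v0), (t1, v1) = samples[j - 1], samples[j]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def long_to_wide(records, sample_period):
    """Turn (time, variable, value) rows into one row per regular time step."""
    by_var = defaultdict(list)
    for t, var, val in records:
        by_var[var].append((t, val))
    for samples in by_var.values():
        samples.sort()
    # Resample only where every variable has data, so nothing is extrapolated.
    start = max(s[0][0] for s in by_var.values())
    stop = min(s[-1][0] for s in by_var.values())
    n_steps = int((stop - start) // sample_period) + 1
    grid = [start + k * sample_period for k in range(n_steps)]
    names = sorted(by_var)
    wide = [[_interp(by_var[name], t) for name in names] for t in grid]
    return grid, names, wide


# Hypothetical long-format records: time in hours, two irregularly sampled variables.
records = [
    (0, "a", 1.0), (2, "a", 3.0), (4, "a", 5.0),
    (0, "b", 10.0), (1, "b", 20.0), (4, "b", 50.0),
]
grid, names, wide = long_to_wide(records, sample_period=1)
```

Each row of `wide` lines up with one entry of `grid` and has one column per variable in `names`, with the gaps in the irregular samples filled by linear interpolation. The sketch uses integer times to sidestep floating-point rounding at the end of the grid.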

Let me know if that works for you.

dmbee commented 5 years ago

The details of this are in the API documentation - but the main documentation / user guide should probably be updated to reflect this use case.

chkoar commented 5 years ago

Thanks for the reply. It wasn't obvious to me from the User Guide. I will check them out. One thing that needs improvement is the examples in the docstrings. They are missing from some classes.

dmbee commented 5 years ago

Thanks - will work on those as well.

dmbee commented 4 years ago

I put the details for InterpLongToWide in the user guide. Will work on docstring examples as well.