intel-analytics / analytics-zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
https://analytics-zoo.readthedocs.io/
Apache License 2.0

(automatic) time series data preprocessing #793

Open shane-huang opened 4 years ago

shane-huang commented 4 years ago

Data quality can significantly affect forecasting results.

Currently we assume that the input data is uniformly sampled in time and that all missing values have been filled. However, in the real world we have found that many time series are not as clean as assumed. There is still a gap in this part of the processing, e.g. resampling to a uniform interval, filling missing values, dealing with outliers, etc.

There are several directions for this part of the work.
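To make the uniform-sampling assumption concrete, here is a minimal sketch of how irregularly timestamped readings could be bucketed onto a fixed grid. It uses only the Python standard library and is not Analytics Zoo code; the function name `resample_uniform`, the choice to average readings that fall in the same slot, and the use of `None` for empty slots are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def resample_uniform(points, step):
    """Place (timestamp, value) pairs on a uniform grid (hypothetical helper).

    Readings that land in the same slot are averaged; slots with no
    reading are returned as None so a later imputation step can fill them.
    """
    if not points:
        return []
    points = sorted(points, key=lambda p: p[0])
    start, end = points[0][0], points[-1][0]
    n_slots = (end - start) // step + 1  # timedelta // timedelta gives an int
    sums = [0.0] * n_slots
    counts = [0] * n_slots
    for ts, value in points:
        idx = (ts - start) // step
        sums[idx] += value
        counts[idx] += 1
    return [
        (start + i * step, sums[i] / counts[i] if counts[i] else None)
        for i in range(n_slots)
    ]


raw = [
    (datetime(2020, 1, 1, 0, 0, 0), 1.0),
    (datetime(2020, 1, 1, 0, 1, 0), 2.0),
    (datetime(2020, 1, 1, 0, 1, 30), 4.0),  # same one-minute slot as the reading above
    (datetime(2020, 1, 1, 0, 4, 0), 5.0),   # minutes 2 and 3 have no readings
]
grid = resample_uniform(raw, timedelta(minutes=1))
print([value for _, value in grid])  # [1.0, 3.0, None, None, 5.0]
```

The empty slots are left as `None` on purpose: resampling and filling are kept as separate steps, so the imputation strategy can be chosen independently.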

shane-huang commented 4 years ago

Let's start with simple, traditional missing-data imputation.
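As a rough illustration of what "simple and traditional" imputation might look like, the sketch below shows two common baselines, forward fill and linear interpolation, on a series where missing values are marked as `None`. This is standard-library Python written for this discussion, not an existing Analytics Zoo API; the function names and the `None` convention are assumptions.

```python
def forward_fill(values):
    """Replace each None with the most recent known value (hypothetical helper).

    Leading None entries stay None because there is nothing to carry forward.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def linear_interpolate(values):
    """Fill None entries that lie between two known values on a straight line.

    Leading and trailing None entries are left untouched, since there is
    only one neighbour to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


series = [1.0, None, 3.0, None, None]
print(forward_fill(series))        # [1.0, 1.0, 3.0, 3.0, 3.0]
print(linear_interpolate(series))  # [1.0, 2.0, 3.0, None, None]
```

Both functions assume the series is already on a uniform grid. Forward fill never looks ahead, which makes it usable on streaming data, while linear interpolation needs a known value on both sides of a gap.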