Open rtavenar opened 4 years ago
Missing data is not incompatible with variable-length time series. You can have a time series whose length is 80 with no missing data and another time series whose length is 60 with missing data. Toy example:
How is it not compatible? Can't you easily distinguish between "missing values" and "padding" by the location of the NaN? If it's at the end -> padding, in the middle -> missing value
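A minimal sketch of that location-based rule, assuming NaN is the only marker in use and that each series is a 1-D array. The helper name `split_nan_roles` is hypothetical and is not part of tslearn:

```python
import numpy as np

def split_nan_roles(series):
    """Split the NaNs of a 1-D series into 'missing' and 'padding' masks.

    Assumption: NaNs after the last observed value are padding,
    NaNs before it are missing values.
    """
    values = np.asarray(series, dtype=float)
    is_nan = np.isnan(values)
    observed = np.flatnonzero(~is_nan)
    # Index just past the last observed value (0 if the series is all NaN).
    true_length = observed[-1] + 1 if observed.size else 0
    padding_mask = np.arange(values.shape[0]) >= true_length
    missing_mask = is_nan & ~padding_mask
    return missing_mask, padding_mask

series = np.array([1.0, np.nan, 3.0, 4.0, np.nan, np.nan])
missing, padding = split_nan_roles(series)
print(missing)  # True only at index 1
print(padding)  # True only at indices 4 and 5
```

One limitation of this rule: if the final true observations of a series are themselves missing, they are indistinguishable from padding, which is one argument for using two distinct marker values.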
I would advocate for two different values (maybe np.nan and np.inf) to highlight the difference. But as Romain said, there is no imputation module for the moment, so NaN values are only used for padding.
I said not incompatible so I think that we agree on this ^^
I received the following question by email*
I am not 100% sure what is implied by "handle missing data", but I can try to formulate an answer:
- `tslearn` does not have a missing data imputation module
- `tslearn` can provide methods that do not rely on the assumption that series to be compared are observed at the same time stamps. For example, if only the ordering of elements matters, one could use Dynamic Time Warping. Having a look at our user guide is likely to provide some input on this (at least I hope so).

*I can no longer answer the questions regarding `tslearn` by email, so please post your questions as a GitHub issue to maximize your chances of getting an answer.
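To illustrate the Dynamic Time Warping point above, here is a minimal sketch of a DTW distance between two univariate series of different lengths. It is a plain NumPy implementation written for illustration, not tslearn's own code (tslearn ships its own DTW in `tslearn.metrics`). The function name `dtw_distance` is hypothetical, and the squared-difference local cost with a final square root is an assumption of this sketch:

```python
import numpy as np

def dtw_distance(s1, s2):
    """DTW distance between two 1-D series, possibly of different lengths."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n, m = len(s1), len(s2)
    # cost[i, j] is the best accumulated cost aligning s1[:i] with s2[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (s1[i - 1] - s2[j - 1]) ** 2
            cost[i, j] = local + min(cost[i - 1, j],      # advance in s1 only
                                     cost[i, j - 1],      # advance in s2 only
                                     cost[i - 1, j - 1])  # advance in both
    return float(np.sqrt(cost[n, m]))

a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 2, 1, 0]  # same shape as a, sampled at different time stamps
dist = dtw_distance(a, b)
print(dist)  # 0.0, since only the ordering of the values matters
```

Because the alignment only depends on the order of the elements, the two series do not need to share time stamps or lengths. NaN values would still propagate through the local cost, so this sketch assumes padding has been stripped beforehand.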