mlgig / sktime

A scikit-learn compatible Python toolbox for machine learning with time series
https://alan-turing-institute.github.io/sktime/
BSD 3-Clause "New" or "Revised" License

SFA version can't handle the multivariate data yet #2

Open ashishsinghucd opened 4 years ago

ashishsinghucd commented 4 years ago

https://github.com/mlgig/sktime/blob/c6e8568f4563dd40bdadbf098f5d84164a47e49e/sktime/transformers/dictionary_based/SFA.py#L105

The code at this line states explicitly that SFA does not support the multivariate format yet. It works for the dataset "load_basic_motions" only because it considers just the first dimension. Also, this attribute (the first dimension) must have the same length for all records; otherwise the execution fails.

lnthach commented 4 years ago

Thanks Ashish. I managed to reproduce the error. Being multivariate is not the issue, since only the univariate data of each dimension is passed to SFA (mrseql.pyx, line 271). The problem is the variable length. SFA later converts the dataframe to a numpy array (SFA.py, line 101) and gets the number of columns from the array shape (SFA.py, line 107). If the length is variable, the array shape contains only one value (e.g. (610,)), hence the failed execution. Unfortunately, there is nothing I can do for this bug. Probably the only way to work around this problem is to pad the data or resample it to the same length.
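
For illustration, here is a minimal sketch of the shape problem and of the padding workaround. It is not taken from SFA.py: the helper `pad_to_same_length`, the toy series and the zero fill value are all assumptions made for the example, and it uses plain numpy rather than the sktime dataframe format.

```python
import numpy as np

def pad_to_same_length(series_list, fill_value=0.0):
    """Right-pad each 1-D series with fill_value up to the longest length.

    Hypothetical helper for illustration; not part of sktime.
    """
    max_len = max(len(s) for s in series_list)
    padded = np.full((len(series_list), max_len), fill_value, dtype=float)
    for i, s in enumerate(series_list):
        padded[i, :len(s)] = s
    return padded

# Three toy series of unequal length (made-up values).
ragged = [np.arange(5.0), np.arange(3.0), np.arange(4.0)]

# Stored as an object array, unequal-length series give a 1-D array,
# so the shape has a single entry and no column count to read.
as_object = np.empty(len(ragged), dtype=object)
for i, s in enumerate(ragged):
    as_object[i] = s
ragged_shape = as_object.shape

# After padding, the array is 2-D and shape[1] is the common length.
padded = pad_to_same_length(ragged)
padded_shape = padded.shape

print(ragged_shape, padded_shape)
```

Zero padding is only one option, and it changes the tail of the shorter series, which may affect the words SFA produces. Resampling every series to a common length is the alternative mentioned above.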