cesium-ml / cesium

Machine Learning Time-Series Platform

`assemble_featureset` does not take channels into consideration #280

Open stefanv opened 5 years ago

stefanv commented 5 years ago

(Thanks to @sarajamal57 for finding this issue.)

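Here, we featurize two time series that have different sets of channels: the first has channels `g`, `r`, and `i`; the second has only `g` and `i`. When `assemble_featureset` joins them, the channel names are ignored and the values appear to be aligned by position, so features from unrelated channels end up in the same column.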

from cesium import time_series, featurize, features
import numpy as np

# Three channels on a shared time grid; channel k is the base signal scaled by k + 1.
channels = ['g', 'r', 'i']
t = np.linspace(0, 2*np.pi, 200)
m = np.repeat(np.sin(t**2)[None, :], len(channels), axis=0) * np.array([1, 2, 3])[:, None]
e = np.ones(t.shape[0])

# First time series: all three channels.
ts = time_series.TimeSeries(t, m, e, channel_names=channels)
feats = featurize.featurize_single_ts(ts, [features.GENERAL_FEATS[0]])

# Second time series: only channels 'g' and 'i' (no 'r').
m2 = np.vstack((m[0], m[2]))
ts2 = time_series.TimeSeries(t, m2, e, channel_names=[channels[0], channels[2]])
feats2 = featurize.featurize_single_ts(ts2, [features.GENERAL_FEATS[0]])

# Join the two feature sets.
df = featurize.assemble_featureset([feats, feats2], [ts, ts2])

The resulting `df` is:

feature amplitude
channel         0         1         2
NaN      0.999894  1.999789  2.999683
NaN      0.999894  2.999683       NaN

instead of the expected:

feature amplitude
channel         0         1         2
NaN      0.999894  1.999789  2.999683
NaN      0.999894       NaN  2.999683
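
To make the difference between the two tables concrete, here is a minimal, library-free sketch of the two alignment strategies. It is not cesium's implementation: `align_by_position` and `align_by_channel` are hypothetical helpers written only for illustration, and the values are copied from the output above.

```python
import math


def align_by_position(rows, width):
    """Pad each row on the right with NaN, ignoring channel names.

    This reproduces the behaviour reported in this issue.
    """
    return [list(values) + [math.nan] * (width - len(values)) for values in rows]


def align_by_channel(rows, channel_names, all_channels):
    """Place each value in the column of its channel name.

    Channels missing from a row are filled with NaN.
    """
    aligned = []
    for values, names in zip(rows, channel_names):
        by_name = dict(zip(names, values))
        aligned.append([by_name.get(ch, math.nan) for ch in all_channels])
    return aligned


# Amplitude values taken from the tables above.
all_channels = ['g', 'r', 'i']
rows = [[0.999894, 1.999789, 2.999683], [0.999894, 2.999683]]
names = [['g', 'r', 'i'], ['g', 'i']]

by_position = align_by_position(rows, len(all_channels))
by_channel = align_by_channel(rows, names, all_channels)

print(by_position[1])  # the 'i' amplitude lands in the 'r' column
print(by_channel[1])   # the 'r' column is NaN, 'i' stays in its own column
```

With name-based alignment, the second row has NaN in the `r` column and keeps the `i` amplitude in the third column, which matches the expected table.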