Lab-ANT / Time2State

An unsupervised framework for inferring the latent states in time series data
MIT License

Number of output rows doesn't match input rows #2

Closed elmiraberjisian closed 1 year ago

elmiraberjisian commented 1 year ago

Hi, thank you for your paper and code. I've applied Time2State to my data (say 10548 observations) but noticed that the number of output rows does not match the input dataset: one observation is missing (i.e., 10547 observations). Is it the first observation that doesn't get assigned, or where does the mismatch come from? Does it by any chance depend on the w or s size? Thank you!

AprilCal commented 1 year ago

Hi,

Some of the baselines do have this issue because of their label assignment algorithm. Time2State ensures that the output and input have the same length. For Time2State, the window size w must be even, and the step size s should be smaller than w. There are no other restrictions.

I have not encountered this issue so far. Could you please provide the parameters (e.g., w, s) for reproducibility?
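The two constraints stated above (an even window size w and a step size s smaller than w) can be checked before fitting. The following is a minimal sketch, not part of Time2State; the helper name `check_window_params` is hypothetical and only encodes the rules given in this reply.

```python
def check_window_params(win_size: int, step: int) -> None:
    """Raise ValueError if the constraints from this thread are violated.

    Hypothetical helper, not a Time2State function.
    """
    if win_size <= 0 or step <= 0:
        raise ValueError("win_size and step must be positive")
    if win_size % 2 != 0:
        raise ValueError(f"win_size must be even, got {win_size}")
    if step >= win_size:
        raise ValueError(
            f"step must be smaller than win_size, got step={step}, win_size={win_size}"
        )


# The values reported in this thread (w=14, s=4) satisfy both constraints,
# so this call returns without raising.
check_window_params(14, 4)
```

Because w=14 and s=4 pass these checks, the parameters alone would not be expected to explain a missing row.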

elmiraberjisian commented 1 year ago

Thank you for your reply. I used w=14 and s=4. My input is a text file; the first few rows look like this, for example:

    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345
    -1.56012479071849,-0.065928795060486,-1.03616262014768,-0.00393704329015308,0.0169288893396338,0.00303243009423345

Moreover, this is the code I use:

    import csv
    import os

    import numpy as np
    import pandas as pd

    # Time2State, CausalConv_LSE_Adaper, DPGMM and params_LSE come from the Time2State repository.

    dataset_path = './input.txt'
    df = pd.read_csv(dataset_path, sep=',', header=None)
    data = df.to_numpy()

    params_LSE['in_channels'] = 6
    params_LSE['win_size'] = 14
    params_LSE['M'] = 10
    params_LSE['N'] = 4
    params_LSE['out_channels'] = 6
    params_LSE['nb_steps'] = 40
    params_LSE['win_type'] = 'hanning'
    win_size = 14
    step = 4

    t2s = Time2State(win_size, step, CausalConv_LSE_Adaper(params_LSE), DPGMM(None)).fit(data, win_size, step)
    prediction = t2s.state_seq
    prediction = np.array(prediction, dtype=int)

    with open(os.getcwd() + '/output' + '.csv', 'w') as fo:
        writer = csv.writer(fo, delimiter=',', lineterminator='\n')
        # write the header
        for y in prediction:
            writer.writerow([y])

Thank you!

AprilCal commented 1 year ago

Hi, I could not reproduce the problem you encountered. My test code produces output with the same length as the input data.

The problem may come from another part of your code. Have you checked the lengths of the input and output with data.shape and prediction.shape?
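One way to follow this suggestion is to compare the row count at each stage: the input, the predicted labels, and the rows actually written to the CSV file. The sketch below uses only the standard library and a stand-in list in place of `t2s.state_seq`, so it does not call Time2State; the helper `locate_mismatch` is hypothetical.

```python
import csv
import io


def locate_mismatch(n_input, n_predicted, n_written):
    """Name the first stage at which the row count changes, or return None."""
    if n_predicted != n_input:
        return "model output"
    if n_written != n_predicted:
        return "file writing"
    return None


# Stand-in for t2s.state_seq: one integer label per input row (assumption).
n_input = 10548
prediction = [0] * n_input

# Write the labels the same way as the script in this thread, but to memory.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
for y in prediction:
    writer.writerow([y])

# Count the rows that ended up in the CSV text.
n_written = sum(1 for _ in csv.reader(io.StringIO(buffer.getvalue())))

print(n_input, len(prediction), n_written)
print(locate_mismatch(n_input, len(prediction), n_written))
```

With real data, `n_input` would be `data.shape[0]` and `len(prediction)` would be `prediction.shape[0]`. If those two agree, the difference is introduced when the file is written or read back.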

AprilCal commented 1 year ago

Here is the test code.

    import numpy as np

    # Time2State, CausalConv_LSE_Adaper, DPGMM and params_LSE come from the Time2State repository.

    win_size = 14
    step = 4

    # Synthetic input: 10548 rows, 4 identical sine-wave channels.
    X = np.concatenate([np.sin(np.linspace(0, 10*np.pi, 10548)).reshape(-1, 1) for i in range(4)], axis=1)
    print(X.shape)

    params_LSE['in_channels'] = 4
    params_LSE['out_channels'] = 4
    params_LSE['nb_steps'] = 2
    params_LSE['win_size'] = win_size
    params_LSE['win_type'] = 'hanning'

    t2s = Time2State(win_size, step, CausalConv_LSE_Adaper(params_LSE), DPGMM(None)).fit(X, win_size, step)
    print(t2s.state_seq.shape)

elmiraberjisian commented 1 year ago

I reran it and didn't get the same error. It might have been something with the header in the output. Sorry about that; please feel free to delete this issue. Thanks.
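The thread does not confirm exactly what happened, but a header row is a common source of off-by-one row counts. The sketch below illustrates the general effect with the standard-library `csv` module: a reader that treats the first line as a header reports one row fewer than a plain row count. The column name `state` and the sample labels are made up for illustration.

```python
import csv
import io

labels = [0, 0, 1, 1, 2]  # made-up state labels

# Case 1: no header row is written.
no_header = io.StringIO()
w = csv.writer(no_header, lineterminator="\n")
for y in labels:
    w.writerow([y])

# Case 2: a header row is written first ("state" is a hypothetical name).
with_header = io.StringIO()
w = csv.writer(with_header, lineterminator="\n")
w.writerow(["state"])
for y in labels:
    w.writerow([y])


def rows_plain(text):
    """Count every line as a data row."""
    return sum(1 for _ in csv.reader(io.StringIO(text)))


def rows_assuming_header(text):
    """Count rows with a reader that consumes the first line as field names."""
    return sum(1 for _ in csv.DictReader(io.StringIO(text)))


print(rows_plain(no_header.getvalue()))              # 5
print(rows_assuming_header(no_header.getvalue()))    # 4: first label taken as header
print(rows_plain(with_header.getvalue()))            # 6: header counted as a row
print(rows_assuming_header(with_header.getvalue()))  # 5
```

A tool that infers a header from a headerless file would show the same one-row shortfall as the second count.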