Open francisduan opened 4 years ago
Signature matrices combine 10 points into a single 3D tensor, so 2000 matrices means the code is using all of the 20K data points. Hope this helps.
Excuse me, but I couldn't find the operation that combines 10 points into a single 3D tensor before that.
Sorry, my bad. Actually, there is a parameter called gap_time that controls how frequently the data is sampled to create signature matrices. It is set to 10 by default, so 2000 matrices are created from 20K points. You can change gap_time depending on how many matrices you want to create. The entire dataset is still being used, though: constructing each signature matrix at a particular time point takes into account the past 60 points (the maximum window size), so technically no information is lost.
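To make the sampling concrete, here is a minimal sketch of that idea, assuming raw data of shape (sensors, time points), a window of 60 and gap_time of 10. The function name `build_signature_matrices` and the way the number of matrices is derived from the data length are assumptions for illustration, not the repository's actual code.

```python
import numpy as np

def build_signature_matrices(raw_data, win=60, gap_time=10):
    """Hypothetical helper: one signature matrix every `gap_time` points."""
    n_sensors, n_points = raw_data.shape
    # Each window ends (exclusive) at one of these indices; the first
    # window needs `win` points of history, so it ends at index `win`.
    end_points = range(win, n_points + 1, gap_time)
    matrices = np.empty((len(end_points), n_sensors, n_sensors))
    for i, end in enumerate(end_points):
        window = raw_data[:, end - win:end]   # the past `win` points
        matrices[i] = window @ window.T / win  # pairwise inner products
    return matrices

rng = np.random.default_rng(0)
raw = rng.standard_normal((5, 20_000))  # synthetic stand-in for the real data
sig = build_signature_matrices(raw)
print(sig.shape)  # (1995, 5, 5)
```

In this sketch the count comes out slightly below 2000 because no matrix is built until 60 points of history are available. Since the window (60) is longer than the stride (10), consecutive windows overlap and every point is read by at least one matrix.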
I suppose it's meant to be:
    for t in range(self.signature_matrices_number):
        adj_t = t * util.gap_time + win
        raw_data_t = raw_data[:, adj_t - win:adj_t]
        signature_matrices[t] = np.dot(raw_data_t, raw_data_t.T) / win
At line 44 of generation_signature_matrice.py:
for t in range(win, self.signature_matrices_number):
This does not make sense, since you are just looping over the first 2000 data points. Is the result correct if you lose so much information?
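The difference between the two loops can be checked with plain arithmetic. The sketch below assumes 2000 matrices, a window of 60 and gap_time of 10 (the numbers mentioned in this thread); `window_ends` is a hypothetical helper that only computes where each window would end, without touching any data.

```python
def window_ends(n_matrices, win, gap_time):
    """Hypothetical helper: exclusive end index of every window read."""
    # Loop as quoted from line 44: the window for step t is [t - win, t)
    original = [t for t in range(win, n_matrices)]
    # Loop with gap_time applied: the window is [adj_t - win, adj_t)
    proposed = [t * gap_time + win for t in range(n_matrices)]
    return original, proposed

original, proposed = window_ends(n_matrices=2000, win=60, gap_time=10)
print(max(original))  # 1999
print(max(proposed))  # 20050
```

Under these assumptions, the quoted loop never reads past index 1999, while the gap_time version reaches the end of a 20K series. Its last few windows would run slightly past 20,000 points, where NumPy slicing silently truncates, so the matrix count or the loop bound may need adjusting.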