Open georgemilosh opened 2 years ago
Have you not done this with `label_period_start` & company?
No. `label_period_start` etc. were technically necessary for me: the way the functions were structured precluded me from using information outside the label range for any other purpose. For example, I may want to run PCA on the full X (so that I can also use `tau<0`) but still train the classifier on just the label range for each `tau`.
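A minimal sketch of that split, not the actual project code: PCA is fitted on every day of X, while the classifier inputs are taken only from days that pair with the label range at a given `tau`. The array shapes, the SVD-based `pca_fit`/`pca_transform` helpers, and the convention that the predictor for label day `t` is day `t + tau` are all assumptions made for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA via SVD on a (samples, features) array; returns (mean, components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project a (samples, features) array onto the fitted components."""
    return (X - mean) @ components.T

# Hypothetical sizes and synthetic data: (years, days per year, grid points).
rng = np.random.default_rng(0)
n_years, n_days, n_features = 20, 120, 50
X = rng.standard_normal((n_years, n_days, n_features))

time_start = 0            # assumed: index of the first available day
label_period_start = 30   # assumed: index of the first labelled day
tau = -5                  # assumed convention: predictor day = label day + tau

label_days = np.arange(label_period_start, n_days)
predictor_days = label_days + tau
assert predictor_days.min() >= time_start, "tau reaches before time_start"

# PCA sees the full X, including days outside the label range.
mean, comps = pca_fit(X.reshape(-1, n_features), n_components=8)

# The classifier only sees the days that pair with the label range.
X_train = pca_transform(X[:, predictor_days].reshape(-1, n_features), mean, comps)
```

`X_train` would then be passed to whichever classifier is used, together with the labels for `label_days`.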
Also, when I recently tried to write a generator for time series, I decided to use the difference between `label_period_start` and `time_start` as the length of the sequence fed to the `LSTM`. But it seems that PCA->LSTM performs the same as plain PCA->NN. At least now I am more confident that temporal information somehow does not play a role.
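The windowing described above could look roughly like the sketch below. It is an illustration under assumptions, not the generator that was actually written: `sequence_windows` is a hypothetical helper, `X_year` is assumed to hold one year of (already PCA-reduced) features with row 0 corresponding to `time_start`, and each window is assumed to end on its label day.

```python
import numpy as np

def sequence_windows(X_year, time_start, label_period_start):
    """Build one window per label day, each of length label_period_start - time_start.

    X_year has shape (days, features); row 0 is assumed to be day `time_start`.
    Returns an array of shape (n_label_days, seq_len, features).
    """
    seq_len = label_period_start - time_start
    if seq_len <= 0:
        raise ValueError("label_period_start must be later than time_start")
    if X_year.shape[0] <= seq_len:
        raise ValueError("X_year is too short to contain any label day")
    # Row index of the first label day is seq_len; every window ends on its label day.
    windows = [X_year[i - seq_len + 1 : i + 1] for i in range(seq_len, X_year.shape[0])]
    return np.stack(windows)

# Toy usage: 10 days, 2 features, label range starting 3 days after time_start.
X_year = np.arange(20).reshape(10, 2)
seqs = sequence_windows(X_year, time_start=15, label_period_start=18)
```

The resulting `(samples, timesteps, features)` layout is the one recurrent layers usually expect; flattening the last two axes gives the input for a plain dense network, which makes the PCA->LSTM versus PCA->NN comparison straightforward to set up.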
Choose labels that correspond to the first few days of the onset of heat waves and see what happens. Currently we mix labels for heat waves that have already started (those that take 14 days) with labels for heat waves that have been ongoing for a long time (and are still above 5%). This potentially destroys dynamical information and prevents the DNN from extracting it.
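One possible way to build such onset-only labels, sketched under assumptions: the existing labels are taken to be a daily 0/1 series for a single year, and `onset_labels` and `n_onset_days` are hypothetical names. Days deeper into an event are set to 0 here; masking them out of training entirely would be an equally plausible choice.

```python
import numpy as np

def onset_labels(labels, n_onset_days):
    """Keep only the first `n_onset_days` days of each contiguous run of 1s."""
    labels = np.asarray(labels, dtype=int)
    onset = np.zeros_like(labels)
    run = 0  # number of consecutive heat-wave days up to and including today
    for i, value in enumerate(labels):
        run = run + 1 if value else 0
        if 0 < run <= n_onset_days:
            onset[i] = 1
    return onset

# Toy usage: two events, of length 4 and 2, keeping the first 2 days of each.
daily = [0, 1, 1, 1, 1, 0, 1, 1, 0]
first_days = onset_labels(daily, n_onset_days=2)
```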