NLeSC / mcfly

A deep learning tool for time series classification and regression
Apache License 2.0

Collection of research ideas #29

Open vincentvanhees opened 8 years ago

vincentvanhees commented 8 years ago

I am listing some research ideas we could look into once there is a first prototype. These may need to be split into separate issues, but for now it may be easier to group them:

  1. Is processing a multi-channel input with nbcol = 1 the same as splitting the data into univariate sequences, processing them in separate branches, and then connecting the branches in a fully connected final layer? The code will be less compact in the latter case, but what is the impact on classification performance?
  2. Should dropout be applied between convolutional layers and/or between LSTM layers? What do others do, and what is the impact on performance? (See the sketch after this list.)
  3. L2 regularization is not mentioned in the article by Ordonez; is L2 regularization the unmentioned standard for CNNs?
  4. What is the impact of return_sequences = True versus False in the LSTM layer? (Also covered in the sketch after this list.)
  5. How many dense layers do we need to match the classification performance of a single LSTM layer?
  6. How many convolutional layers do we need to match the classification performance of three convolutional plus three LSTM layers?
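
For items 2 and 4, here is a minimal Keras sketch (all layer sizes, input shapes, and the number of classes are hypothetical) of where Dropout can sit and what return_sequences controls:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, LSTM, Dropout, Dense

n_timesteps, n_channels, n_classes = 512, 9, 6  # hypothetical shapes

model = Sequential([
    # Dropout between convolutional layers ...
    Conv1D(64, kernel_size=3, padding='same', activation='relu',
           input_shape=(n_timesteps, n_channels)),
    Dropout(0.5),
    Conv1D(64, kernel_size=3, padding='same', activation='relu'),
    # ... and between LSTM layers. return_sequences=True makes the LSTM
    # emit an output at every time step, which is required to stack a
    # second LSTM on top of it.
    LSTM(32, return_sequences=True),
    Dropout(0.5),
    # return_sequences=False (the default) keeps only the last time step,
    # which is what the final classification layer needs here.
    LSTM(32, return_sequences=False),
    Dense(n_classes, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
```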
cwmeijer commented 8 years ago
  1. When training RNNs with LSTMs or similar units, does it help to keep the hidden state between batches, so that batches are not all trained from an all-zero initial state? This requires the mini-batches to be synced so that the i-th sequence in batch j lies directly in front of the i-th sequence in batch j+1 in the source dataset. See #39 for some details. (A sketch follows below.)
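
A minimal sketch of this idea with Keras stateful LSTMs (batch size, shapes, and data are hypothetical). With stateful=True the final state of sample i in batch j becomes the initial state of sample i in batch j+1, instead of being reset to zeros:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.utils import to_categorical

batch_size, window_len, n_channels, n_classes = 32, 128, 9, 6

model = Sequential([
    # stateful=True requires a fixed batch size, so the batch size is
    # part of the input specification.
    LSTM(32, stateful=True,
         batch_input_shape=(batch_size, window_len, n_channels)),
    Dense(n_classes, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Dummy data; in practice consecutive batches must contain consecutive
# windows of the same source sequences, so shuffling is disabled.
x = np.random.randn(batch_size * 4, window_len, n_channels)
y = to_categorical(np.random.randint(n_classes, size=batch_size * 4), n_classes)
for epoch in range(3):
    model.fit(x, y, batch_size=batch_size, epochs=1, shuffle=False)
    model.reset_states()  # reset at sequence boundaries, here once per epoch
```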
vincentvanhees commented 8 years ago

The Freiburg EEG dataset as a possible benchmark? https://epilepsy.uni-freiburg.de/freiburg

dafnevk commented 8 years ago

Possible data sets to check out:

vincentvanhees commented 7 years ago

Copy-pasting #21 below, so that we can close it and keep this issue as a collection of all research ideas until someone decides to devote time to any of them.

Y. Zheng proposed in 2014 that splitting multivariate time series into univariate signals and processing them separately as distinct branches of the DL architecture is better than processing them with a single multivariate CNN. I am not sure whether this holds, but it should be something we can easily test in Keras (see the sketch below). The article is on the OneDrive, link: https://nlesc-my.sharepoint.com/personal/v_vanhees_esciencecenter_nl/_layouts/15/guestaccess.aspx?guestaccesstoken=cKHpfUmasCukMxT9YMnoLKvwtQiFlFYdJclcl%2buhcYM%3d&docid=17139ecaca7d5428ea3d184e04a4e59f5
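
A minimal Keras sketch (shapes and layer sizes are hypothetical) of such a split-channel architecture: each channel gets its own univariate convolutional branch, and the branches only meet in the fully connected part. This is also the comparison asked for in item 1 of the list above; the multivariate alternative is a single Conv1D over an input of shape (n_timesteps, n_channels).

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv1D, Flatten, Concatenate, Dense

n_timesteps, n_channels, n_classes = 512, 9, 6  # hypothetical shapes

# One univariate input and one convolutional branch per channel.
inputs, branches = [], []
for _ in range(n_channels):
    inp = Input(shape=(n_timesteps, 1))
    x = Conv1D(32, kernel_size=3, padding='same', activation='relu')(inp)
    x = Flatten()(x)
    inputs.append(inp)
    branches.append(x)

# The branches are connected only in the fully connected final layer.
merged = Concatenate()(branches)
output = Dense(n_classes, activation='softmax')(merged)

model = Model(inputs=inputs, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam')
```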