ruining99 / Gravitational-Wave-CNN

Ruining's supervision project at Pembroke-King's summer programme. Mentors: Dr. Nathan McDaniel, Dr. Harris Markakis. The project aims to develop CNNs for identifying gravitational-wave signals from merger events and predicting their parameters.

Model Architecture #2

Closed ruining99 closed 5 years ago

ruining99 commented 5 years ago

I constructed a 1D CNN classifier with Daniel George's model as a reference (DG model architecture image below).

My code's location in the repository: \CNN\Classifier-Stage-1

There are several things I want to check:

  1. Input dimension: Daniel George's input dimension is double mine, so his sample_frequency (8192 Hz) is double the one in my code (4096 Hz). I wonder whether I should follow what he did, or whether I can first try with 4096 Hz?
  2. Pooling layer: First, I think the pooling layer should be 1D max pooling; please let me know if this should change. I noticed that in Daniel George's code the pooling layer is placed before the activation layer, whereas in Keras the activation function is combined into the 1D convolution layer. I am not sure why he did that, but I don't think this is a big issue. Also, could a dropout layer be a potential substitute for pooling? I wonder if it would work as well.
  3. Compile parameters: I used Adam as the optimiser, which seems to work well for time-series data. For the loss function I used cross-entropy, which is what DG used too. Please let me know if any of the above should change. (A sketch of my current layout follows this list.)
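
For concreteness, here is a minimal Keras sketch of the layout I am describing, with the activation fused into the Conv1D layers. The segment length, filter counts, and kernel sizes below are placeholders, not the exact values committed in the repository:

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # placeholder input: 1 s of strain sampled at 4096 Hz, single channel
    layers.Conv1D(16, kernel_size=16, activation="relu", input_shape=(4096, 1)),
    layers.MaxPooling1D(pool_size=4),      # 1D max pooling
    layers.Conv1D(32, kernel_size=8, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),  # two classes: signal vs. noise
])
model.compile(optimizer="adam",                        # Adam optimiser
              loss="sparse_categorical_crossentropy",  # cross-entropy loss
              metrics=["accuracy"])
```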

Thank you! Ruining

nathan-jm commented 5 years ago

I think that a sampling rate of 4096 Hz is fine for initial work; training should be faster than at 8192 Hz.

I agree that the pooling should be 1D, though I'm not sure whether max or average pooling is more appropriate here. Unless Dr. Markakis has a specific suggestion, I would try both.

I think that it's best to replicate Daniel George's architecture as closely as possible, at least at first, so you can just use the default linear activation in the Keras convolution layer and apply the ReLU activation after pooling.
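
For concreteness, one way to do that in Keras is to leave the convolution linear (no activation argument) and add a separate Activation layer after the pooling; the layer sizes below are just placeholders:

```python
from tensorflow.keras import layers, models

block = models.Sequential([
    layers.Conv1D(16, kernel_size=16, input_shape=(4096, 1)),  # no activation argument, so the convolution stays linear
    layers.MaxPooling1D(pool_size=4),   # pool first
    layers.Activation("relu"),          # then apply the non-linearity, matching DG's ordering
])
```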

I think that a drop-out layer might not be a good substitute for pooling, since it would not reduce the size of the full neural network, so training might become very expensive.

I'll let Dr. Markakis comment on the other items.

ruining99 commented 5 years ago

I found a Stack Overflow post on the order of pooling layers and activation layers. It seems that the order yields the same result in our case, since ReLU (the activation function we are using) is a monotonically increasing non-linearity. But since the computational cost is lower when the activation is applied to fewer neurons, I will apply the pooling layer first.

The post is here: https://stackoverflow.com/questions/35543428/activation-function-after-pooling-layer-or-convolutional-layer
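
As a quick sanity check of that equivalence (just an illustration, not part of the repository code):

```python
import numpy as np

# ReLU is monotonically non-decreasing, so relu(max(x)) == max(relu(x))
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 4))          # 1000 pooling windows of width 4
relu = lambda a: np.maximum(a, 0.0)

pool_then_relu = relu(x.max(axis=1))    # pool first, then activate
relu_then_pool = relu(x).max(axis=1)    # activate first, then pool
assert np.allclose(pool_then_relu, relu_then_pool)
```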