vlawhern / arl-eegmodels

This is the Army Research Laboratory (ARL) EEGModels Project: A Collection of Convolutional Neural Network (CNN) models for EEG signal classification, using Keras and Tensorflow

Input shape consistency with paper #41

Closed agamemnonc closed 2 years ago

agamemnonc commented 2 years ago

Hi there, thanks for releasing your code.

Assuming that the input has shape (C, T), where C is the number of channels and T is the number of samples in the window, Table 2 of the published version of the paper (EEGNet v4) indicates that the first layer of the network reshapes the input to (1, C, T), so that the first convolutional layer, with kernel size (1, 64), is applied along the temporal dimension (temporal convolution/filtering).


Looking at the code, though, it seems to me that the input has shape (C, T, 1):
https://github.com/vlawhern/arl-eegmodels/blob/4a512e503198db2010848813ead9afbf8cd54c97/EEGModels.py#L127-L132

Should there not be a Reshape layer at the bottom of the network? Or am I missing something obvious here?

vlawhern commented 2 years ago

Tensorflow/Keras makes a distinction between 'channels_first' and 'channels_last' format when passing in data to Convolution layers.

channels_first (NCHW) for 2D input would be (channels, rows, cols)

channels_last (NHWC) for 2D input would be (rows, cols, channels)
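To make the two layouts concrete, here is a minimal sketch using NumPy only. It assumes a batch of EEG trials stored as an array of shape (trials, C, T); the sizes and variable names are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical sizes, for illustration only (not taken from the paper or repo).
n_trials, C, T = 8, 22, 256
X = np.zeros((n_trials, C, T), dtype=np.float32)

# channels_last (NHWC): the singleton "image channel" axis goes last,
# so each trial has shape (C, T, 1).
X_last = X[..., np.newaxis]

# channels_first (NCHW): the singleton axis comes right after the batch axis,
# so each trial has shape (1, C, T).
X_first = X[:, np.newaxis, :, :]

print(X_last.shape, X_first.shape)
```

In both cases the rows are the EEG channels and the columns are the time samples, so a kernel of size (1, 64) still spans 64 time samples of a single EEG channel; only the position of the singleton axis differs.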

My understanding is that early Tensorflow versions preferred channels_first for convolution operations for computational reasons (for more technical info, see NVIDIA's documentation https://docs.nvidia.com/deeplearning/performance/dl-performance-convolutional/index.html and this Stack Overflow thread https://stackoverflow.com/questions/44280335/how-much-faster-is-nchw-compared-to-nhwc-in-tensorflow-cudnn). In addition, if I remember correctly, convolutions on CPU could only be done in NCHW in these early TF versions. So, to stay compatible with users on either CPU or GPU, I used the NCHW format when the EEGNet paper/code was initially released.

These restrictions no longer exist in more recent TF versions, so I decided to switch the format to NHWC, which is now the default in TF/Keras. I also made the change in part because I kept getting user issues about this (see https://github.com/vlawhern/arl-eegmodels/issues/13 and https://github.com/vlawhern/arl-eegmodels/issues/18).
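If you have data prepared for the older NCHW layout, a sketch like the following could be used to move it to NHWC. The helper names `nchw_to_nhwc` and `nhwc_to_nchw` are hypothetical (they are not part of this repo or of Keras), and the array sizes are arbitrary.

```python
import numpy as np

def nchw_to_nhwc(x):
    """Move the channel axis from position 1 to the end: (N, ch, H, W) -> (N, H, W, ch)."""
    return np.transpose(x, (0, 2, 3, 1))

def nhwc_to_nchw(x):
    """Inverse of nchw_to_nhwc: (N, H, W, ch) -> (N, ch, H, W)."""
    return np.transpose(x, (0, 3, 1, 2))

# Hypothetical example: 2 trials, 4 EEG channels, 16 time samples.
C, T = 4, 16
X_nchw = np.arange(2 * 1 * C * T, dtype=np.float32).reshape(2, 1, C, T)

X_nhwc = nchw_to_nhwc(X_nchw)
print(X_nhwc.shape)
```

Because the singleton axis has length 1, the values themselves are unchanged; only the axis order differs, and the round trip through both helpers recovers the original array.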

Hope this helps.

agamemnonc commented 2 years ago

That makes sense, thank you!