Input shape consistency with paper

vlawhern / arl-eegmodels

This is the Army Research Laboratory (ARL) EEGModels Project: A Collection of Convolutional Neural Network (CNN) models for EEG signal classification, using Keras and Tensorflow

Hi there, thanks for releasing your code.

Assuming that the input has shape (C, T), where C is the number of channels and T is the number of samples in the window, Table 2 of the published version of the paper (EEGNet v4) says that the first layer of the network reshapes the input to shape (1, C, T), so that the first convolutional layer, with kernel size (1, 64), is applied along the temporal dimension (temporal convolution/filtering).

Looking at the code, though, it seems to me that the input has shape (C, T, 1).
https://github.com/vlawhern/arl-eegmodels/blob/4a512e503198db2010848813ead9afbf8cd54c97/EEGModels.py#L127-L132

Should there not be a Reshape layer at the bottom of the network? Or am I missing something obvious here?

TensorFlow/Keras distinguishes between the 'channels_first' and 'channels_last' data formats when passing data to convolution layers.

channels_first (NCHW) for 2D input would be (channels, rows, cols)

channels_last (NHWC) for 2D input would be (rows, cols, channels)
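As a minimal NumPy sketch of these two layouts for EEG trials (the sizes `N`, `C`, `T` and the variable names are made up for illustration; this is not code from EEGModels.py), the same data can be arranged either way, and converting between the two is just an axis permutation:

```python
import numpy as np

# Hypothetical sizes: N trials, C EEG channels, T time samples per window.
N, C, T = 4, 8, 128
rng = np.random.default_rng(0)
trials = rng.standard_normal((N, C, T))

# channels_first (NCHW): a single feature map with rows = C and cols = T.
x_nchw = trials[:, np.newaxis, :, :]   # shape (N, 1, C, T)

# channels_last (NHWC): the same feature map, with the singleton axis last.
x_nhwc = trials[..., np.newaxis]       # shape (N, C, T, 1)

# Moving the feature-map axis from position 1 to the end converts NCHW to NHWC.
nchw_as_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))
same_data = np.array_equal(nchw_as_nhwc, x_nhwc)
```

In both arrays the EEG channels run along the rows and time runs along the columns; only the position of the singleton feature-map axis differs.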

My understanding is that in early TensorFlow versions channels_first was preferred for convolution operations for computational reasons (for more technical info, see NVIDIA's documentation https://docs.nvidia.com/deeplearning/performance/dl-performance-convolutional/index.html and this Stack Overflow thread https://stackoverflow.com/questions/44280335/how-much-faster-is-nchw-compared-to-nhwc-in-tensorflow-cudnn). In addition, if I remember correctly, convolutions on CPU in these early TF versions could only be done in NCHW. So, to maintain compatibility with users on either CPU or GPU, I used the NCHW format when the EEGNet paper/code was initially released.

These restrictions no longer exist in more recent TF versions, so I decided to switch the format to NHWC, which is now the default in TF/Keras. This was also done in part because I kept getting user issues about this (see https://github.com/vlawhern/arl-eegmodels/issues/13 and https://github.com/vlawhern/arl-eegmodels/issues/18).
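To illustrate why the layout change does not alter what the first layer computes, here is a rough NumPy sketch of a single temporal filter of length 64 applied in both layouts. The helper `temporal_filter`, the sizes, and the random kernel are assumptions for illustration, and it uses 'valid' padding with no bias for simplicity, so it is not the actual Keras layer:

```python
import numpy as np

def temporal_filter(x, kernel, time_axis):
    """Cross-correlate a 1D kernel along `time_axis` of x ('valid' padding, no bias)."""
    # sliding_window_view shortens `time_axis` to T - K + 1 and appends a
    # window axis of length K at the end; the dot product consumes that axis.
    windows = np.lib.stride_tricks.sliding_window_view(x, kernel.size, axis=time_axis)
    return windows @ kernel

# Hypothetical sizes: N trials, C channels, T samples, kernel length K.
N, C, T, K = 4, 8, 128, 64
rng = np.random.default_rng(0)
trials = rng.standard_normal((N, C, T))
kernel = rng.standard_normal(K)

x_nchw = trials[:, np.newaxis, :, :]   # (N, 1, C, T), time is axis 3
x_nhwc = trials[..., np.newaxis]       # (N, C, T, 1), time is axis 2

out_nchw = temporal_filter(x_nchw, kernel, time_axis=3)   # (N, 1, C, T - K + 1)
out_nhwc = temporal_filter(x_nhwc, kernel, time_axis=2)   # (N, C, T - K + 1, 1)

# Same numbers in both layouts, up to the position of the singleton axis.
layouts_agree = np.allclose(np.transpose(out_nchw, (0, 2, 3, 1)), out_nhwc)
```

In both cases the kernel slides along T within each EEG channel, so an input of shape (C, T, 1) in NHWC plays the same role as (1, C, T) in NCHW.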

Hope this helps.

Input shape consistency with paper #41