Why I read the fer2013.csv only 13000 more data?

isseu / emotion-recognition-neural-networks

Emotion recognition using DNN with tensorflow

MIT License


Why I read the fer2013.csv only 13000 more data? #6

Open renhui19931001 opened 7 years ago

renhui19931001 commented 7 years ago

I used the cvs_to_numpy.py script on the dataset downloaded from https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data to generate the .npy files, but I only get a little over 13,000 samples out of it.

mysayalHan commented 7 years ago

I think he used OpenCV to reduce the data. Unless a face can be found in the picture with the OpenCV XML cascade, the picture is not used as training data.

dearhoper commented 7 years ago

The dataset contains 35887 images in total: 28709 for Training, 3589 for PublicTest and another 3589 for PrivateTest. cvs_to_numpy.py filters the images according to whether or not a face is found in the image, so only 13000+ images pass the selection.
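As a rough illustration of that filtering step, the sketch below keeps only the samples for which a face check succeeds. It is not the code from cvs_to_numpy.py: `filter_by_face`, `toy_has_face` and the sample data are hypothetical, and the toy check (mean brightness above a threshold) merely stands in for a real OpenCV face detector.

```python
from typing import Callable, Iterable, List, Tuple

# (emotion label, flat list of pixel intensities); a hypothetical layout
Sample = Tuple[int, List[int]]


def filter_by_face(
    samples: Iterable[Sample],
    has_face: Callable[[List[int]], bool],
) -> List[Sample]:
    """Keep only the samples whose image passes the face check."""
    return [(label, pixels) for label, pixels in samples if has_face(pixels)]


def toy_has_face(pixels: List[int]) -> bool:
    """Stand-in for a real detector: 'finds a face' if the image is bright.

    Purely illustrative; a real check would run a face detector here.
    """
    return sum(pixels) / len(pixels) > 100


# Made-up samples, only to show how the count shrinks after filtering.
samples = [
    (0, [200, 180, 190, 210]),
    (3, [10, 20, 5, 0]),
    (6, [120, 130, 90, 140]),
]

kept = filter_by_face(samples, toy_has_face)
print(f"kept {len(kept)} of {len(samples)} samples")
```

Any image rejected by the check is dropped entirely, which is why the number of usable samples can end up far below the 35887 rows in the CSV.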

sakshamjindal07 commented 7 years ago

Hi @dearhoper ,

I successfully split the dataset into train and test sets and generated the 4 files loaded below:

self._images = np.load('data_set_fer2013.npy')
self._labels = np.load('data_labels_fer2013.npy')
self._images_test = np.load('test_set_fer2013.npy')
self._labels_test = np.load('test_labels_fer2013.npy')

No. of images in data_set = 10809
No. of images in test_set = 3157

The problem I am facing is that when training starts, tflearn prints this log:

[+] Training network

Run id: emotion_recognition
Log directory: /tmp/tflearn_logs/

Training samples: 10809
Validation samples: 10809

I just want to know if you faced a similar issue, where the training set and the validation set had the same number of images. I tried to dig into the TFLearn library but could not find a workaround.

Can you help me pinpoint the issue?

Regards,
Saksham

dearhoper commented 7 years ago

Hi @sakshamjindal07 ,

There is a bug in the dataset_loader.py file. See the bold words as follows:

...
def load_from_save(self):
    self._images = np.load(join(SAVE_DIRECTORY, SAVE_DATASET_IMAGES_FILENAME))
    self._labels = np.load(join(SAVE_DIRECTORY, SAVE_DATASET_LABELS_FILENAME))
    self._images_test = np.load(join(SAVE_DIRECTORY, SAVE_DATASET_IMAGES_TEST_FILENAME))
    self._labels_test = np.load(join(SAVE_DIRECTORY, SAVE_DATASET_LABELS_TEST_FILENAME))
    self._images = self._images.reshape([-1, SIZE_FACE, SIZE_FACE, 1])
    self._images_test = self._images_test.reshape([-1, SIZE_FACE, SIZE_FACE, 1])
    self._labels = self._labels.reshape([-1, len(EMOTIONS)])
    self._labels_test = self._labels_test.reshape([-1, len(EMOTIONS)])

Rloguzzo commented 7 years ago

@sakshamjindal07 How did you get the test data?

NikAndrush commented 7 years ago

@dearhoper How can I get the test data?

dearhoper commented 7 years ago

@NikAndrush Download the FER2013 dataset and extract its contents: https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data. There are 3 fields in fer2013.csv. The "Usage" field indicates how the current image is used: "Training" marks training data and "PublicTest" marks test data. (Data marked as "PrivateTest" is not used here.) You can use cvs_to_numpy.py to parse the training data and test data.
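The split by "Usage" can be sketched roughly as follows. This is not the repository's script, only a minimal standard-library example: `split_by_usage` is a hypothetical helper, the column names `emotion` and `pixels` are assumed from the Kaggle CSV header, and the rows below are made up and far shorter than real images.

```python
import csv
import io
from collections import defaultdict


def split_by_usage(csv_file):
    """Group (emotion, pixels) pairs by the value of the "Usage" column."""
    splits = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Pixels are assumed to be stored as one space-separated string.
        pixels = [int(value) for value in row["pixels"].split()]
        splits[row["Usage"]].append((int(row["emotion"]), pixels))
    return splits


# Tiny made-up stand-in for fer2013.csv (real rows hold a full image each).
sample_csv = io.StringIO(
    "emotion,pixels,Usage\n"
    "0,70 80 82 72,Training\n"
    "2,151 150 147 155,Training\n"
    "4,231 212 156 164,PublicTest\n"
    "6,24 32 36 30,PrivateTest\n"
)

splits = split_by_usage(sample_csv)
train = splits["Training"]    # used as training data
test = splits["PublicTest"]   # used as test data; "PrivateTest" is ignored
print(len(train), len(test))
```

With the real file, the same function could be called on `open("fer2013.csv", newline="")` instead of the in-memory sample.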

asthasharma017 commented 6 years ago

Has the dataset been removed from https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data? I am getting a 404 error for this.

jingyugao commented 6 years ago

@asthasharma017 You should first sign up for an account, and then you will be able to get the file.