tzirakis / Multimodal-Emotion-Recognition

This repository contains the code for the paper `End-to-End Multimodal Emotion Recognition using Deep Neural Networks`.
http://ieeexplore.ieee.org/document/8070966/
BSD 3-Clause "New" or "Revised" License

Method get_jpg_string() usage. #3

Open Ulitochka opened 6 years ago

Ulitochka commented 6 years ago

Hello, thanks for your work.

In data_provider.py we prepare the tfrecords files, but we preprocess only the audio data:

```python
features=tf.train.Features(feature={
    'sample_id': _int_feauture(i),
    'subject_id': _int_feauture(subject_id),
    'label': _bytes_feauture(label.tobytes()),
    'raw_audio': _bytes_feauture(audio.tobytes()),
}))
```
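For comparison, here is a minimal sketch of how the writer could also serialize a JPEG-encoded frame under a `frame` key. This is not code from the repository; the helper names, the dummy data, the frame size, and the use of `tf.image.encode_jpeg` are my own assumptions, written against the TF 1.x API the repo uses:

```python
import numpy as np
import tensorflow as tf

def _int_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# Dummy stand-ins for one real sample.
sample_id, subject_id = 0, 1
label = np.zeros(2, dtype=np.float32)          # label vector
audio = np.zeros(640, dtype=np.float32)        # raw audio chunk
frame = np.zeros((96, 96, 3), dtype=np.uint8)  # one video frame (size is a guess)

# Encode the frame as a JPEG string (graph mode, so run it in a session).
with tf.Session() as sess:
    jpg_string = sess.run(tf.image.encode_jpeg(frame))

example = tf.train.Example(features=tf.train.Features(feature={
    'sample_id': _int_feature(sample_id),
    'subject_id': _int_feature(subject_id),
    'label': _bytes_feature(label.tobytes()),
    'raw_audio': _bytes_feature(audio.tobytes()),
    'frame': _bytes_feature(jpg_string),  # the key data_generator.py looks for
}))

with tf.python_io.TFRecordWriter('sample.tfrecords') as writer:
    writer.write(example.SerializeToString())
```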

But in data_generator.py we load tfrecords that are expected to contain frame data:

```python
features={
    'raw_audio': tf.FixedLenFeature([], tf.string),
    'label': tf.FixedLenFeature([], tf.string),
    'subject_id': tf.FixedLenFeature([], tf.int64),
    'frame': tf.FixedLenFeature([], tf.string),
}
```
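Under the same assumptions (TF 1.x, and that `frame` holds a JPEG-encoded string, which is only my guess given the `get_jpg_string()` name in the issue title), the corresponding parse and decode step would look roughly like this:

```python
import tensorflow as tf

# Read back the single example written in the sketch above.
serialized_example = next(tf.python_io.tf_record_iterator('sample.tfrecords'))

features = tf.parse_single_example(
    serialized_example,
    features={
        'raw_audio': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([], tf.string),
        'subject_id': tf.FixedLenFeature([], tf.int64),
        'frame': tf.FixedLenFeature([], tf.string),
    })

raw_audio = tf.decode_raw(features['raw_audio'], tf.float32)
label = tf.decode_raw(features['label'], tf.float32)
# Only valid if 'frame' really is a JPEG string.
frame = tf.image.decode_jpeg(features['frame'], channels=3)

with tf.Session() as sess:
    audio_np, frame_np = sess.run([raw_audio, frame])
```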

The question is: when and where do we encode the frame data?

hemnf commented 6 years ago

I have the same concern. Based on the paper, they should use a ResNet50 for the visual network, but I didn't see anything like that in the code. It also gives me an error about the frame name.
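Just to illustrate what I mean, a visual branch along those lines could look like the sketch below. This is not code from the repository; the input size and the use of `tf.keras.applications.ResNet50` (no pretrained weights, average pooling) are my own assumptions:

```python
import tensorflow as tf

# ResNet-50 backbone producing one feature vector per frame.
backbone = tf.keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(96, 96, 3), pooling='avg')

frames = tf.zeros([4, 96, 96, 3])   # dummy batch of decoded frames
visual_features = backbone(frames)  # shape: [4, 2048]
```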

@Ulitochka Have you fixed your problem? @tzirakis I would appreciate a reply from you.