gogyzzz / localatt_emorecog

A PyTorch implementation of "Automatic Speech Emotion Recognition Using Recurrent Neural Networks with Local Attention"

Same session with same data gives different accuracy #3

Open twostay opened 5 years ago

twostay commented 5 years ago

Hi, I have a couple of questions :).

  1. After running the project with one session, I deleted model.pth and re-ran another session that I had already run before (the other files were kept; no data files were left over from the original run). However, the results differed by around 5%. So I want to ask: which files hold the variables related to the training process?
  2. What exactly does the variable "session" in make_dataset.py do? Thank you.
gogyzzz commented 5 years ago

  1. I'm not sure why the model's accuracy varies so much between runs. When I evaluated my own model, its accuracy was poor compared to the original paper's, so the original model probably uses some additional technique that we don't know about.

  2. The concept of a "session" comes from the IEMOCAP and MSP-IMPROV datasets; a "session" plays the same role as a "fold" in n-fold cross-validation.
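To make the session/fold analogy concrete, here is a minimal sketch of leave-one-session-out evaluation. The utterance ids and the `split_by_session` helper are illustrative, not the repo's actual code; real ids would come from the corpus file names (e.g. "Ses01..." in IEMOCAP):

```python
# Hypothetical (utterance_id, session_id) pairs for illustration.
utts = [("Ses01F_impro01_F000", 1), ("Ses02M_script01_M003", 2),
        ("Ses03F_impro04_F010", 3), ("Ses01M_impro02_M001", 1)]

def split_by_session(utts, test_session):
    """Hold out one session as the test fold; train on the rest."""
    train = [u for u, s in utts if s != test_session]
    test = [u for u, s in utts if s == test_session]
    return train, test

# IEMOCAP has 5 sessions, so this behaves like 5-fold cross-validation:
for sess in (1, 2, 3, 4, 5):
    train, test = split_by_session(utts, sess)
    print(sess, len(train), len(test))
```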

twostay commented 5 years ago

I fixed the issue; it was because the contents used for training and testing were different. And the accuracy issue was probably because the attention implementation in the model was incorrect. You might want to take another look at the paper and the model.

gogyzzz commented 5 years ago

@twostay Please let me know what I missed. It will also help other users.

twostay commented 4 years ago

  1. Unit standard deviation normalization.
  2. The way alpha is multiplied with the input matrix is wrong. According to the paper, for an input of shape (B, L, F) the product should also be (B, L, F), which is then summed over time to (B, F). The actual product, however, is (B, B, F) (or something like it), which still sums down to (B, F); that's why it's hard to notice :(. You should use scalar (element-wise) multiplication instead of matrix multiplication in this case, as described in the paper.
  3. PyTorch's CrossEntropyLoss already applies a softmax (log-softmax) internally, so adding another softmax at the model's output is redundant and can cause exploding or vanishing gradients (I experienced this when writing a CIFAR-10 model).
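The three fixes above can be sketched in a few lines. This is a minimal illustration, not the repo's actual code; shapes and variable names are assumptions, with B = batch, L = time steps, F = feature dimension:

```python
import torch

B, L, F = 4, 50, 128            # batch, time steps, feature dim
h = torch.randn(B, L, F)        # stand-in for BLSTM outputs, shape (B, L, F)

# (1) Unit standard deviation normalization, applied per feature
#     dimension here as an illustration.
h = (h - h.mean(dim=(0, 1))) / h.std(dim=(0, 1))

# (2) Attention pooling: alpha holds one scalar weight per frame.
scores = torch.randn(B, L)              # unnormalized attention scores
alpha = torch.softmax(scores, dim=1)    # (B, L), sums to 1 over time

# Wrong: torch.matmul(alpha, h) broadcasts to a (B, B, F) product,
# mixing utterances across the batch, yet still sums down to (B, F).
# Correct: broadcast alpha as per-frame scalars and sum over time.
pooled = (alpha.unsqueeze(-1) * h).sum(dim=1)   # (B, L, F) -> (B, F)

# (3) Feed raw logits to CrossEntropyLoss: it applies log-softmax
#     internally, so there is no softmax layer at the model output.
logits = torch.randn(B, 4)              # e.g. 4 emotion classes
target = torch.randint(0, 4, (B,))
loss = torch.nn.CrossEntropyLoss()(logits, target)
print(pooled.shape, loss.item())
```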

By the way, you should try a newer version of PyTorch; it's much better: no more Variable vs. Tensor distinction, only Tensors.

twostay commented 4 years ago

Do you mind telling me where you downloaded the MSP-IMPROV database? I can't find it :( Thanks.