hasanhuz / SpanEmo


I want to ask a few questions. #4

Closed tengwang0318 closed 2 years ago

tengwang0318 commented 2 years ago

The paper is awesome! I just have one point that I don't totally understand:

H_i = Encoder([CLS] + |C| + [SEP] + s_i)

Does |C| here refer to an already-chosen set of emotion categories? And is C fixed?

hasanhuz commented 2 years ago

Hi there, thanks for your interest in our paper. |C| corresponds to the number of emotions, and we use the label names (anger, fear, joy, etc.). Yes, it's fixed. Hope this helps!
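For concreteness, here is a minimal sketch of the input construction described above. It assumes the 11 SemEval-2018 Task 1 (E-c) emotion labels and builds a plain string; the actual repo handles this through the tokenizer, so treat the function name and label tuple as illustrative:

```python
# Hypothetical label set: the 11 SemEval-2018 Task 1 (E-c) emotions.
LABELS = ("anger", "anticipation", "disgust", "fear", "joy", "love",
          "optimism", "pessimism", "sadness", "surprise", "trust")

def build_input(sentence, labels=LABELS):
    # [CLS] + |C| (the fixed label-name sequence) + [SEP] + s_i
    return "[CLS] " + " ".join(labels) + " [SEP] " + sentence

print(build_input("I can't stop smiling today"))
```

Because the label sequence is fixed, every sentence is paired with the same |C| prefix, and the encoder's hidden states over those label positions can later be scored against the sentence.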

tengwang0318 commented 2 years ago

Thank you!

tengwang0318 commented 2 years ago

I have another problem: I can't understand the LCA loss equation. To understand it, I also read the original paper, 'Multi-Label Neural Networks with Applications to Functional Genomics and Text Categorization', but I still can't follow it. The LCA loss is as follows:

L_LCA = 1 / (|y^0| * |y^1|) * Σ_{(p, q) ∈ y^0 × y^1} exp(y_p − y_q)

where y^0 and y^1 are the negative and positive label sets. I think this equation penalizes the situation where y_p is much bigger than y_q, but does not penalize the situation where y_p is much smaller than y_q. I don't think the LCA loss makes sense. I would appreciate it if you could explain it!

hasanhuz commented 2 years ago

Yes, we want y_p to be smaller than y_q. Eq (3) compares pairs of emotions, where each pair takes one emotion from the negative (neg) set and one from the positive (pos) set, respectively. The number of comparisons equals |pos| * |neg|. This penalises the model when it predicts labels that shouldn't co-exist. You might ask how we obtain label-label correlation: we extract label co-occurrences from the data. I'd suggest going over our implementation of the loss and then running it on a few examples; I think that would help you a lot.
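The pairwise comparison described above can be sketched in plain Python (this is a simplified stand-in for the repo's PyTorch implementation; the function name and the use of Python lists are illustrative):

```python
import math

def lca_loss(probs, targets):
    """Sketch of Eq (3): average exp(y_p - y_q) over every
    (negative, positive) pair, so scoring a negative emotion
    above a positive one is penalised exponentially."""
    pos = [i for i, t in enumerate(targets) if t == 1]
    neg = [i for i, t in enumerate(targets) if t == 0]
    if not pos or not neg:          # no pairs to compare
        return 0.0
    total = sum(math.exp(probs[p] - probs[q]) for p in neg for q in pos)
    return total / (len(pos) * len(neg))

# Correct ranking (positives scored high) gives a small penalty...
print(lca_loss([0.9, 0.9, 0.1], [1, 1, 0]))
# ...while scoring the negative emotion highly inflates it.
print(lca_loss([0.1, 0.1, 0.9], [1, 1, 0]))
```

The double loop makes the |pos| * |neg| comparison count explicit: with two positives and one negative above, exactly two pairs are averaged.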

tengwang0318 commented 2 years ago

Thank you for replying!!! I have read your implementation, but maybe I haven't made my question clear. y_p being smaller than y_q means the probability of a negative emotion is smaller than that of a positive one. Doesn't this just give the model a tendency to assign a high probability to positive emotions and a low probability to negative ones? I don't see why the equation works. Wouldn't exp(abs(y_p - y_q)) work better?

hasanhuz commented 2 years ago

Yes, this helps the model give a high probability to highly correlated emotions and a low probability to less correlated ones. If you'd like stronger penalization, you can also follow the implementation of the original paper, which negates the output of each comparison before applying the exp function; that would behave similarly to the abs function. In our case, since we train the LCA loss jointly with cross-entropy, we decided to go this way, and it appears to work pretty well. Hope this helps :)
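A quick numeric check of the two pairwise terms discussed in this thread may make the difference concrete (the scores below are purely illustrative):

```python
import math

def eq3_term(y_p, y_q):
    # Pairwise term from Eq (3): penalises a negative score y_p
    # that exceeds a positive score y_q.
    return math.exp(y_p - y_q)

def abs_term(y_p, y_q):
    # The abs variant suggested in the question: penalises any gap,
    # in either direction.
    return math.exp(abs(y_p - y_q))

good, bad = (0.1, 0.9), (0.9, 0.1)        # (y_p, y_q) orderings
print(eq3_term(*good), eq3_term(*bad))    # small vs. large: ranking rewarded
print(abs_term(*good), abs_term(*bad))    # identical: ordering ignored
```

Eq (3) is small when the desired ordering y_p < y_q holds and large when it is violated, so it acts as a ranking penalty; the abs variant assigns the same value to both orderings, so it cannot distinguish a correct ranking from a reversed one.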

tengwang0318 commented 2 years ago

Thank you! I will go through the original paper and do some experiments!