Justin1904 / TensorFusionNetworks

PyTorch implementation of Tensor Fusion Networks for multimodal sentiment analysis.

F-score #9

Closed EgorLakomkin closed 5 years ago

EgorLakomkin commented 5 years ago

Hi,

I am curious why the F-score is computed with the average parameter set to 'binary'. Doesn't that report the F-score only for class 1, i.e. positive sentiment? Wouldn't it be fairer to report a macro F-score, which averages the individual scores of the negative and positive classes? In the 'binary' setup, the performance on class 0 is largely ignored.
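To make the difference concrete, here is a minimal sketch in plain Python. It is not the repository's evaluation code: the helper names (`f1_for_class`, `binary_f1`, `macro_f1`) and the toy labels are made up for illustration, and the functions only mimic what I understand the 'binary' and 'macro' averaging modes to do.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class is never predicted or present.
    return 2 * tp / denom if denom else 0.0


def binary_f1(y_true, y_pred, pos_label=1):
    """'binary' averaging: F1 of the positive class only."""
    return f1_for_class(y_true, y_pred, pos_label)


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """'macro' averaging: unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, l) for l in labels) / len(labels)


# Toy data (made up): 8 positive and 2 negative samples,
# and a degenerate model that predicts "positive" for everything.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 10

print(round(binary_f1(y_true, y_pred), 3))  # 0.889
print(round(macro_f1(y_true, y_pred), 3))   # 0.444
```

With these toy labels, the always-positive model still gets a high 'binary' score, while the macro score drops because the F1 for class 0 is zero.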

Justin1904 commented 5 years ago

That's a good point. Let me check with the authors of the paper and get back to you. That said, I think in the binary classification case it is also common to calculate F1 only w.r.t. the positive class.

Justin1904 commented 5 years ago

I can confirm that the authors used the 'binary' setting for the F1 score. I am not exactly sure of the reasoning behind it, but I guess that for this particular task the regression loss (i.e. MAE) already reflects how well the model is doing.

EgorLakomkin commented 5 years ago

Thanks, @Justin1904!