DTUComputeCognitiveSystems / deep_detektor

Automated Factual-Claim Detection in Danish Broadcasting

Evaluation measures #3

Closed by sfvnielsen 7 years ago

sfvnielsen commented 7 years ago

How should we quantitatively evaluate the performance of the network we train?

sfvnielsen commented 7 years ago

The problem is highly imbalanced class-wise, so plain accuracy would be misleading. We need metrics that take the imbalance into account (ROC, F1, ...).
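To illustrate why accuracy is a poor fit here, the following is a minimal pure-Python sketch of F1 and ROC AUC for a binary problem. It is not the project's scoring code: the function names (`confusion_counts`, `accuracy`, `f1_score`, `roc_auc`) and the toy labels are assumptions made up for this example, with label 1 standing for "claim" and 0 for "no claim".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of correct predictions; dominated by the majority class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    # Convention chosen here: F1 is 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy imbalanced data (hypothetical): 5 positives among 100 samples.
y_true = [1] * 5 + [0] * 95
y_pred_all_negative = [0] * 100  # a classifier that never predicts "claim"

print(accuracy(y_true, y_pred_all_negative))  # high, despite finding nothing
print(f1_score(y_true, y_pred_all_negative))  # exposes the failure
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a sketch but would be replaced by a rank-based or library implementation on real data.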

NorthGuard commented 7 years ago

I'm working on a general framework for scores. F1 will be included in the initial commit.

NorthGuard commented 7 years ago

The latest commits include a lot of measures. Printing them all requires xarray, though (pandas' multidimensional sibling).