chakki-works / seqeval

A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
MIT License
1.09k stars, 129 forks

Proposal to get TP, FP, FN of all targets #80

Open sathyarr opened 3 years ago

sathyarr commented 3 years ago

Feature description

Proposal to get TP, FP, FN of all targets:

At a certain point in the computation, we have three arrays with one entry per target: tp_sum, pred_sum, and true_sum.

e.g.,

target_names = ['person', 'org', 'place']
tp_sum   = [1 0 0]
pred_sum = [1 0 1]
true_sum = [1 1 0]

Here, tp_sum is used as the numerator of both the Precision and the Recall calculations.

So, tp_sum is simply the True Positive (TP) count of each target.
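To make the role of the three arrays concrete, here is a minimal sketch in plain Python using the example values above. The helper name `safe_ratio` is hypothetical (it is not part of seqeval), and returning 0.0 for a zero denominator is an assumption made only for this illustration.

```python
def safe_ratio(numerators, denominators):
    """Divide elementwise; a zero denominator yields 0.0 (an assumption for this sketch)."""
    return [n / d if d else 0.0 for n, d in zip(numerators, denominators)]


target_names = ['person', 'org', 'place']
tp_sum = [1, 0, 0]
pred_sum = [1, 0, 1]
true_sum = [1, 1, 0]

# tp_sum is the numerator in both cases; only the denominator differs.
precision = safe_ratio(tp_sum, pred_sum)
recall = safe_ratio(tp_sum, true_sum)

for name, p, r in zip(target_names, precision, recall):
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The same elementwise arithmetic applies if the values are held in NumPy arrays rather than lists.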


pred_sum is used as the denominator of Precision. By the definition of Precision, then, pred_sum == TP + FP.

We already know the TP count (from tp_sum), so FP can be computed as FP = pred_sum - tp_sum.


true_sum is used as the denominator of Recall. By the definition of Recall, then, true_sum == TP + FN.

Again, we already know the TP count (from tp_sum), so FN can be computed as FN = true_sum - tp_sum.
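The two subtractions can be sketched as follows, using the example values from the top of this proposal. The function name `confusion_counts` and its return shape are hypothetical, meant only to illustrate the idea and not to describe an existing seqeval function.

```python
def confusion_counts(target_names, tp_sum, pred_sum, true_sum):
    """Return per-target TP, FP and FN derived from the three count arrays.

    Hypothetical helper for illustration; not an existing seqeval API.
    """
    counts = {}
    for name, tp, pred, true in zip(target_names, tp_sum, pred_sum, true_sum):
        counts[name] = {
            "TP": tp,
            "FP": pred - tp,  # predicted entities that were not correct
            "FN": true - tp,  # gold entities that were not found
        }
    return counts


target_names = ['person', 'org', 'place']
tp_sum = [1, 0, 0]
pred_sum = [1, 0, 1]
true_sum = [1, 1, 0]

result = confusion_counts(target_names, tp_sum, pred_sum, true_sum)
for name, c in result.items():
    print(name, c)
```

With these inputs, 'org' ends up with one FN (a gold entity that was never predicted) and 'place' with one FP (a prediction with no matching gold entity).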


Does this sound logical?

If so, would it be better to add a function for this, similar to performance_measure?