emanjavacas / pie

A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.
MIT License

Add support per class in the confusion table #33

Closed PonteIneptique closed 4 years ago

PonteIneptique commented 5 years ago

In its current state, the confusion table is really hard to make sense of, because the support of each class is not shown. For example, if NOMpro occurs 120 times, being mislabeled as NOMcom 90 times is not the same situation as being mislabeled 30 times.

emanjavacas commented 5 years ago

Sorry, I am not getting the issue. Can you show an example?

PonteIneptique commented 5 years ago

Sure, and btw, I can't assign myself but that's something I'll be happy to do

Here is some output of a confusion matrix:

Expected  Total Errors  Predictions  Predicted times
qui       243           quod         88
                        quis         82
                        quam         31
                        quo          27
                        qua          14
                        antequam     1

What I'd like is something more like:

Expected  Total Errors  Support  Predictions  Predicted times  Support
qui       243           500      quod         88               300
                                 quis         82               450
                                 quam         31               50
                                 quo          27               90
                                 qua          14               200
                                 antequam     1                500

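That would help me see that qui is actually ~50% accurate (243 errors for a support of 500) and that the biggest issue here might be quam, which shows up 31 times as a wrong prediction for qui even though its own support is only 50.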
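To make the idea concrete, here is a minimal sketch of how such a table could be computed. It assumes "support" means the number of gold occurrences of a class, and the function name `confusion_with_support` and the toy data are hypothetical, not part of pie's API:

```python
from collections import Counter, defaultdict


def confusion_with_support(trues, preds):
    """Hypothetical helper (not pie's API): per-class errors plus support.

    Assumes `support` is the number of times a class appears in `trues`.
    Only classes with at least one error get a row.
    """
    support = Counter(trues)
    confusions = defaultdict(Counter)
    for expected, predicted in zip(trues, preds):
        if expected != predicted:
            confusions[expected][predicted] += 1

    table = {}
    for expected, wrong in confusions.items():
        total_errors = sum(wrong.values())
        table[expected] = {
            "total_errors": total_errors,
            "support": support[expected],
            "accuracy": 1 - total_errors / support[expected],
            # (wrong prediction, times predicted, support of that class),
            # most frequent confusion first, ties broken alphabetically
            "confusions": [
                (pred, times, support[pred])
                for pred, times in sorted(
                    wrong.items(), key=lambda kv: (-kv[1], kv[0])
                )
            ],
        }
    return table


# Toy data, invented for illustration only
trues = ["qui"] * 6 + ["quod"] * 3 + ["quam"]
preds = ["qui", "qui", "qui", "quod", "quod", "quam",
         "quod", "quod", "qui",
         "quam"]

table = confusion_with_support(trues, preds)
for expected, row in table.items():
    print(expected, row["total_errors"], row["support"], row["confusions"])
```

With both counts side by side, the per-class accuracy and the relative weight of each confusion can be read directly from a row instead of being guessed from raw error counts.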

emanjavacas commented 5 years ago

Sounds fine by me. Anyway the confusion tables for lemmatization are usually quite unwieldy, but it makes sense to have the marginal counts already there.

PonteIneptique commented 5 years ago

To be fair, for me, the example is more important for the other tasks (e.g. Gender or Tense) :)