emanjavacas / pie

A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.
MIT License

Add a metric for empty values #50

Open PonteIneptique opened 4 years ago

PonteIneptique commented 4 years ago

Hi Enrique! I am thinking that run_check could use another metric for some tasks: Score (Not empty), i.e. accuracy, precision and recall computed on the non-empty values only. This is something we already get when running a full eval with class scores, but I do think it could be useful to track over training.

Basically, the task declaration could take a new optional, non-null key, "empty_value": "_" (I am thinking of the dictionary that has name in it), and accuracy, precision and recall would then also be computed on the filtered results. While it's important to know the accuracy on empty values as well (Tense should not be applied to Nouns), morphological features are incredibly skewed (for example, there are definitely fewer Tense occurrences than Case ones).
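To make the proposal concrete, here is a minimal sketch of what such a metric could look like. It is not pie's actual code: the function name `non_empty_scores`, the shape of the `task` dictionary and the choice of macro-averaging precision and recall over the non-empty labels are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical task declaration: "empty_value" is the proposed new key.
task = {"name": "Tense", "empty_value": "_"}


def non_empty_scores(trues, preds, empty_value="_"):
    """Score predictions while ignoring the empty label (illustrative only).

    accuracy  : computed over tokens whose gold label is not empty
    precision : macro-averaged over the non-empty labels
    recall    : macro-averaged over the non-empty labels
    """
    if len(trues) != len(preds):
        raise ValueError("trues and preds must have the same length")

    # Accuracy restricted to tokens that actually carry the feature.
    kept = [(t, p) for t, p in zip(trues, preds) if t != empty_value]
    accuracy = sum(t == p for t, p in kept) / len(kept) if kept else 0.0

    # Per-label counts over all tokens, so that a non-empty prediction
    # on an empty gold token still lowers precision for that label.
    tp, n_pred, n_true = Counter(), Counter(), Counter()
    for t, p in zip(trues, preds):
        n_true[t] += 1
        n_pred[p] += 1
        if t == p:
            tp[t] += 1

    labels = sorted((set(n_true) | set(n_pred)) - {empty_value})
    precisions = [tp[l] / n_pred[l] if n_pred[l] else 0.0 for l in labels]
    recalls = [tp[l] / n_true[l] if n_true[l] else 0.0 for l in labels]
    n = len(labels)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n if n else 0.0,
        "recall": sum(recalls) / n if n else 0.0,
    }


# Toy example: half of the gold labels are empty.
trues = ["_", "_", "Pres", "Past", "_", "Pres"]
preds = ["_", "Pres", "Pres", "_", "_", "Pres"]
scores = non_empty_scores(trues, preds, empty_value=task["empty_value"])
print(scores)
```

Macro-averaging is only one option here; micro-averaged or per-label scores would fit the same filtering step.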

Inter-categorical information is important as well, but we already have that in eval (i.e. the morphological features that are not empty are also skewed data: there are fewer subjunctive perfects than presents, fewer 2nd persons than 3rd ones...)

emanjavacas commented 4 years ago

I realize I never replied to this! I think this would be a cool addition indeed.