nert-nlp / streusle

STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses)
Creative Commons Attribution Share Alike 4.0 International

Evaluating lextag tagging performance #40

Open nelson-liu opened 5 years ago

nelson-liu commented 5 years ago

Hi!

I'd like to build a system to predict each token's lextag. I think the evaluation script for this is streuseval.py?

If so, it doesn't seem to be part of the latest release? Also, is the data the same between 4.1 and the master ref? I'm not sure what the release cycle looks like for STREUSLE, but it could be nice to have a minor release with all the improvements since last July :)

nschneid commented 5 years ago

streuseval.py is for supersenses+MWEs (I didn't realize it postdated the last release). I don't think it gives lexcat precision and recall, but when it scores token-level tags, one version of the tag includes the lexcat.
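
For intuition, here is a minimal sketch of that token-level scoring, not streuseval.py's actual interface or metric; the plain-list interface and the example tag strings are illustrative assumptions:

```python
# Minimal sketch of token-level full-tag (lextag) accuracy.
# NOT streuseval.py's actual interface or metric; the plain lists of
# tag strings below are illustrative assumptions.
def lextag_accuracy(gold, pred):
    """Fraction of tokens whose full lextag matches exactly."""
    assert len(gold) == len(pred), "sequences must be token-aligned"
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# A full lextag packs MWE position + lexcat + supersense into one string.
gold = ["O-N-n.PERSON", "B-V-v.social", "I_", "O-P-p.Locus"]
pred = ["O-N-n.PERSON", "B-V-v.social", "I~", "O-P-p.Locus"]  # I~ (weak) vs. I_ (strong) is a mismatch
print(lextag_accuracy(gold, pred))  # 0.75
```

Exact full-tag match is the strictest view; the supersense and MWE scores in streuseval.py evaluate the pieces separately.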

I need to do a release soon that cleans up some of the preposition supersenses and updates UD to version 2.4.

nelson-liu commented 5 years ago

Ah ok, thanks for clarifying that!

> I don't think it gives lexcat precision and recall, but when it scores token-level tags, one version of the tag includes the lexcat.

This sounds like what I want, since I'm predicting the token-level tags. How is this different from lexcat precision and recall?

nschneid commented 5 years ago

The lexcat conceptually applies to the lexical expression as a whole, which could be a multiword expression (unlike the POS/dependency information, which is truly token-level). The lexcat is encoded in the token-level full tag purely as a convenience for sequence taggers. To avoid redundancy, the I tags that continue an MWE carry no lexcat. So if you are working with automatically predicted MWEs, errors in the MWE analysis will affect how lexcats are counted.
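
Concretely, the full tag decomposes something like this; parse_lextag is just a throwaway illustration, not a helper in the repo:

```python
# Throwaway illustration (not part of STREUSLE): split a full lextag into
# (MWE position, lexcat, supersense). Lexcats contain no hyphens, so
# splitting on at most two hyphens is safe for these examples.
def parse_lextag(lextag):
    parts = lextag.split("-", 2)
    mwe = parts[0]
    lexcat = parts[1] if len(parts) > 1 else None
    supersense = parts[2] if len(parts) > 2 else None
    return mwe, lexcat, supersense

for tag in ["B-V-v.social", "I_", "O-P-p.Locus", "O-PRON"]:
    print(tag, "->", parse_lextag(tag))
# B-V-v.social -> ('B', 'V', 'v.social')  # the B token carries the MWE's lexcat
# I_           -> ('I_', None, None)      # continuation tokens: no lexcat, by design
# O-PRON       -> ('O', 'PRON', None)     # some lexcats have no supersense
```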