Evaluating model without considering entity types

dwadden / dygiepp

Span-based system for named entity, relation, and event extraction.

MIT License


Evaluating model without considering entity types #78

Closed: serenalotreck closed this issue 2 years ago

serenalotreck commented 2 years ago

Is there a way to use the allennlp evaluate command but indicate that I'd like to calculate performance as if all the entities just had one type (e.g. ENTITY, like in scispacy)? I'm applying the SciERC model out-of-domain on a dataset that only has one entity type, and I just want to know how well the model finds entities, and don't care about what type it assigns them.

dwadden commented 2 years ago

There isn't currently an option for this, though it should be doable. Feel free to submit a PR. I can take a look this weekend and offer some suggestions on what might be the simplest way to do this.

serenalotreck commented 2 years ago

I would love some suggestions, happy to take a stab at a PR based on what you think!

dwadden commented 2 years ago

OK, I think I have a way to do it. You're interested in doing this for named entity recognition, right?

During training, metrics are calculated by this method call. Here's the method definition.

This loops over each class and computes false positives and false negatives. You could modify it to give the model credit for predicting any label other than self.none_label, which is 0. Let me know if that doesn't make sense; I can try to provide some more details.

The quickest thing to do is probably just modify that method and change it back when you're done. If you want, you could add a command-line flag for this and figure out how to get it passed where it needs to go, but that might be more pain than it's worth.
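A minimal standalone sketch of that idea, not the actual dygiepp method: it assumes one integer label per candidate span, a mask marking which candidates are real, and a none label of 0, as described above. The function names untyped_ner_counts and precision_recall_f1 are made up for illustration, and plain Python sequences stand in for the tensors the real metric would receive.

```python
from typing import Sequence, Tuple


def untyped_ner_counts(
    predictions: Sequence[int],
    gold_labels: Sequence[int],
    mask: Sequence[int],
    none_label: int = 0,  # assumption: 0 marks "not an entity"
) -> Tuple[int, int, int]:
    """Count TP/FP/FN while treating every non-none label as one ENTITY class."""
    tp = fp = fn = 0
    for pred, gold, keep in zip(predictions, gold_labels, mask):
        if not keep:
            continue  # skip padded / invalid span candidates
        pred_is_entity = pred != none_label
        gold_is_entity = gold != none_label
        if pred_is_entity and gold_is_entity:
            tp += 1  # credit even when the predicted type differs from gold
        elif pred_is_entity:
            fp += 1
        elif gold_is_entity:
            fn += 1
    return tp, fp, fn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1, returning 0.0 for empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    preds = [1, 2, 0, 3, 0, 1]
    golds = [2, 2, 0, 0, 1, 1]
    valid = [1, 1, 1, 1, 1, 0]
    counts = untyped_ner_counts(preds, golds, valid)
    print(counts, precision_recall_f1(*counts))
```

In the per-class version, the first span above (predicted 1, gold 2) would count as both a false positive and a false negative; here it counts as a single true positive, which is the behavior wanted when only entity detection matters.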

serenalotreck commented 2 years ago

That makes sense! I actually had to write a script to do the same thing for the predictions of a separate method, whose output was originally in brat format, so I used brat_to_dygiepp.py to convert that output and wrote a script that also works on my dygiepp predictions. I'll probably try what you've suggested to corroborate the output of what I wrote, but for the moment I'll just do the temporary version. If someone comes along and reopens this issue wanting the same thing, I'll happily go through and add a command-line flag for it.
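For anyone wanting the script-based route, here is a rough sketch of span-only scoring over prediction files. It is not the script mentioned above. It assumes each line is a JSON document whose "ner" and "predicted_ner" fields hold one list per sentence, with each entry starting with [start, end, label] and possibly followed by scores; check those field names against your own files. The helper names span_set and score_untyped are hypothetical.

```python
import json
from typing import Dict, Iterable, Set, Tuple

Span = Tuple[int, int, int]  # (sentence index, start token, end token)


def span_set(doc: Dict, key: str) -> Set[Span]:
    """Collect entity spans under `key`, dropping the label and any scores."""
    spans: Set[Span] = set()
    for sent_idx, sentence in enumerate(doc.get(key, [])):
        for entry in sentence:
            # Assumption: the first two fields of each entry are start and end.
            spans.add((sent_idx, entry[0], entry[1]))
    return spans


def score_untyped(lines: Iterable[str]) -> Tuple[int, int, int]:
    """Return (tp, fp, fn) over all documents, matching on span boundaries only."""
    tp = fp = fn = 0
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the file
        doc = json.loads(line)
        gold = span_set(doc, "ner")            # assumed gold field name
        pred = span_set(doc, "predicted_ner")  # assumed prediction field name
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    return tp, fp, fn


if __name__ == "__main__":
    example = json.dumps({
        "ner": [[[0, 1, "Task"], [3, 3, "Method"]], [[5, 6, "Metric"]]],
        "predicted_ner": [[[0, 1, "Method", 1.0, 0.9]],
                          [[5, 6, "Metric", 1.0, 0.9], [8, 8, "Task", 1.0, 0.9]]],
    })
    print(score_untyped([example]))
```

To score a file, pass an open file handle, for example score_untyped(open(path)) with path pointing at your predictions. Matching on exact boundaries is a design choice; partial-overlap matching would need a different comparison.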

dwadden commented 2 years ago

Sounds good, glad you found a fix.