Closed — serenalotreck closed this issue 2 years ago
There isn't currently an option for this, though it should be doable. Feel free to submit a PR. I can take a look this weekend and offer some suggestions on what might be the simplest way to do this.
I would love some suggestions, happy to take a stab at a PR based on what you think!
OK, I think I have a way to do it. You're interested in doing this for named entity recognition, right?
During training, metrics are calculated by this method call. Here's the method definition.
This loops over each class and computes false positives and false negatives. You could modify it to just give the model credit for predicting any label other than self.none_label, which is 0. Let me know if that doesn't make sense; I can try to provide some more details.
The quickest thing to do is probably just modify that method and change it back when you're done. If you want, you could add a command-line flag for this and figure out how to get it passed where it needs to go, but that might be more pain than it's worth.
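The label-collapsing idea described above can be sketched in isolation. This is not DyGIE++'s actual metric code — just a hypothetical, standalone version of "untyped" span scoring, assuming entities are (start, end, label) tuples and that the null label is the empty string:

```python
# Hypothetical sketch of untyped NER scoring: erase every entity type
# before comparing spans, so the model gets credit for finding a span
# regardless of which label it assigned. Span format and the
# none_label/placeholder names are illustrative assumptions, not
# DyGIE++ internals.

def untyped_f1(predicted, gold, none_label="", placeholder="ENTITY"):
    """Precision/recall/F1 over spans, ignoring entity types."""
    # Drop null predictions, then replace every label with a placeholder.
    pred_spans = {(s, e, placeholder) for s, e, lab in predicted
                  if lab != none_label}
    gold_spans = {(s, e, placeholder) for s, e, lab in gold}

    true_pos = len(pred_spans & gold_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

A prediction counts as correct as long as its boundaries match a gold span, which is exactly the "any label other than the null label" behavior suggested above.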
That makes sense! I actually had to do the same thing separately on the model's prediction output, which was originally in brat format -- so I used brat_to_dygiepp.py to convert the output and wrote a script that also worked for my dygiepp predictions. I'll probably try what you've suggested to corroborate the output of what I wrote, but for the moment will just do the temporary version -- if someone comes along and reopens this issue wanting the same thing, I'll happily go through and add a command-line flag for it.
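For anyone doing the same thing on prediction files rather than inside the trainer, here is a rough sketch of scoring dygiepp-style JSONL output with types collapsed. The field names ("ner" for gold, "predicted_ner" for predictions, entries starting with [start, end, label, ...]) are my assumptions about the format, so check them against your own files:

```python
import json

# Hedged sketch: score DyGIE++-style JSONL predictions as if every
# entity had one type. Assumes each line is a document with
# per-sentence "ner" (gold) and "predicted_ner" lists whose entries
# begin [start, end, label, ...]; field names are assumptions.

def collapse_types(doc):
    """Return gold and predicted span sets with labels discarded."""
    gold, pred = set(), set()
    for sent in doc.get("ner", []):
        for start, end, *_ in sent:
            gold.add((start, end))
    for sent in doc.get("predicted_ner", []):
        for start, end, *_ in sent:
            pred.add((start, end))
    return gold, pred

def score_lines(lines):
    """Untyped precision/recall/F1 over an iterable of JSONL lines."""
    tp = n_pred = n_gold = 0
    for line in lines:
        gold, pred = collapse_types(json.loads(line))
        tp += len(gold & pred)
        n_pred += len(pred)
        n_gold += len(gold)
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    return p, r, (2 * p * r / (p + r) if p + r else 0.0)
```

To run it over a prediction file you could pass the open file handle directly, e.g. score_lines(open("predictions.jsonl")).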
Sounds good, glad you found a fix.
Is there a way to use the allennlp evaluate command but indicate that I'd like to calculate performance as if all the entities just had one type (e.g. ENTITY, like in scispacy)? I'm applying the SciERC model out-of-domain on a dataset that only has one entity type, and I just want to know how well the model finds entities; I don't care what type it assigns them.