Closed danich1 closed 5 years ago
The discriminator model uses the output of the generative model to classify sentences. In other words, the generative model gives each sentence a confidence score: the estimated likelihood that the sentence mentions a relationship. The discriminator model then adds sentence features on top of the generative model's output, which in theory should improve the final confidence score. In a perfect world the discriminator would outperform the generator, but as you can see, this process is quite messy.
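To make the two-stage setup concrete, here is a minimal sketch and not the code from this PR. It assumes the generative model's confidence scores are used as soft training labels and that the discriminator is a plain logistic regression over sentence features. The feature vectors, the scores, and the names `train_discriminator` and `predict` are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_discriminator(features, soft_labels, epochs=500, lr=0.5):
    """Fit a logistic regression to soft labels with batch gradient descent.

    The loss is cross-entropy against the soft labels, so the gradient for
    each example is (prediction - soft_label) * feature.
    """
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, p in zip(features, soft_labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - p
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the discriminator's confidence score for one sentence."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical toy data: two binary sentence features per sentence, and the
# generative model's confidence score for each sentence (assumed values).
features = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
generative_scores = [0.9, 0.8, 0.2, 0.1]

weights, bias = train_discriminator(features, generative_scores)
scores = [predict(weights, bias, x) for x in features]
```

In this toy setup the discriminator learns that the first feature tracks the generative model's high-confidence sentences. Whether it ends up beating the generator on held-out labels depends on how informative the features are.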
That's a really great question. I'm not sure of the concrete answer, but my main hunch is that class imbalance plays an important role in performance. For CbG and GiG, fewer than roughly 10% of the labeled sentences are positive, while DaG has about a 50-50 split. Another pitfall is that the evaluation sets could use more labeled examples. Ideally I would have about 1k sentences labeled for each relationship, but given the circumstances that is not trivial to do. Lastly, some relationships are easier to predict than others: it could be that DaG and CtD are easier to detect than GiG and CbG.
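As a rough illustration of the class imbalance point, the sketch below compares the precision of a trivial "always predict positive" baseline on two made-up label sets. The counts are hypothetical, chosen only to mirror a roughly 50-50 split versus a 10% positive rate; they are not the actual evaluation data.

```python
def precision_recall(labels, predictions):
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical label sets of 100 sentences each (not the real data).
balanced_labels = [1] * 50 + [0] * 50    # about a 50-50 split
imbalanced_labels = [1] * 10 + [0] * 90  # 10% positive

# Trivial baseline: call every sentence positive.
always_positive = [1] * 100

balanced_precision, balanced_recall = precision_recall(
    balanced_labels, always_positive
)
imbalanced_precision, imbalanced_recall = precision_recall(
    imbalanced_labels, always_positive
)
```

The baseline's precision equals the positive rate, so a model evaluated on the imbalanced set starts from a much lower precision floor, which makes precision-recall numbers hard to compare across relationships.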
This PR contains the final results for the discriminator model. Not much code review needed. Just take a look at the figures generated. Let me know what you think.
Side note: the data files for this pull request will show up in a new one. I don't want to overwhelm you with files.