greenelab / snorkeling

Extracting biomedical relationships from literature with Snorkel 🏊

Model calibration #99

Closed danich1 closed 4 years ago

danich1 commented 4 years ago

This PR shows the results of calibrating the discriminator models. Deep learning models tend to be overconfident in their predictions. As a consequence, sentence scores can be inflated toward the high end, and vice versa. To resolve this, I used a temperature scaling approach. Each notebook contains a figure showing the results after calibration.
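For readers unfamiliar with temperature scaling, here is a minimal sketch of the idea for a binary classifier: divide the model's logits by a single scalar T, chosen to minimize the negative log-likelihood on held-out labeled data. This is not the code from the notebooks. The function names (`mean_nll`, `fit_temperature`), the grid search, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mean_nll(logits, labels, temperature):
    # Average binary negative log-likelihood after dividing logits by T.
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1.0 - eps)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits, labels, low=0.05, high=10.0, steps=200):
    # Hypothetical helper: a plain grid search over T. A gradient-based
    # optimizer is the more common choice; the grid keeps this sketch short.
    grid = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda t: mean_nll(logits, labels, t))


# Synthetic held-out set (an assumption, not data from this PR): labels are
# drawn from well-calibrated logits, then the logits are scaled by 3 to
# mimic an overconfident model.
rng = random.Random(0)
true_logits = [rng.gauss(0.0, 1.5) for _ in range(2000)]
labels = [1 if rng.random() < sigmoid(z) else 0 for z in true_logits]
raw_logits = [3.0 * z for z in true_logits]

temperature = fit_temperature(raw_logits, labels)
calibrated_probs = [sigmoid(z / temperature) for z in raw_logits]
```

Because T is one positive scalar applied to every logit, the ranking of sentences does not change; only the confidence attached to each score does. On this synthetic data the fitted T should land near the factor of 3 used to inflate the logits.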

For a quick review of this PR, just look at the notebooks.

danich1 commented 4 years ago
  1. Close. In a calibration curve, the actual score is the fraction of positive labels located in that bin. For example, at 0.8 approximately 80% of the bin belongs to the positive class. This resource provides a bit more intuition on how calibration works.

  2. Right. The idea is to get the dotted line to line up with the dashed line as closely as possible. If we accomplish that, we can argue the model's predictions are more reliable.
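
To make the two points above concrete, here is a rough sketch of how the points on a calibration curve can be computed: predictions are grouped into equal-width bins, and each bin contributes its mean predicted score and its fraction of positive labels. The helper name `calibration_points` and the toy data are assumptions for illustration, not code from the notebooks.

```python
def calibration_points(probs, labels, n_bins=10):
    # Hypothetical helper: returns one (mean predicted score, fraction of
    # positives, count) tuple per non-empty equal-width bin on [0, 1].
    score_sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        score_sums[idx] += p
        positives[idx] += y
        counts[idx] += 1
    return [
        (score_sums[i] / counts[i], positives[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Toy data: ten predictions at 0.85 with 8 positives, and ten predictions
# at 0.15 with 2 positives.
probs = [0.85] * 10 + [0.15] * 10
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
points = calibration_points(probs, labels)
```

A perfectly calibrated model would put every point on the diagonal, where the mean predicted score equals the fraction of positives. In this toy case the high bin has a mean score of 0.85 but only 80% positives, so it sits slightly below the diagonal. scikit-learn's `sklearn.calibration.calibration_curve` offers a ready-made version of the same computation.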