EmilyAlsentzer / clinicalBERT

Repository for Publicly Available Clinical BERT Embeddings
MIT License

Unable to reproduce results #2

Closed · ZhaofengWu closed this issue 5 years ago

ZhaofengWu commented 5 years ago

I can't seem to reproduce your results on MedNLI with the two released models using the same hyperparameters presented in your paper's Appendix B. You reported 84-85%, but I can only get to 81-82% on the test set. Do you know why? Are the reported results on the dev set or the test set? If relevant, I'm using the pytorch-pretrained-bert repo.

EmilyAlsentzer commented 5 years ago

The results are reported on the test set after hyperparameter tuning on the dev set. We found that the best parameters for the model trained on all notes were LR = 2e-5, batch size = 16, and epochs = 4. We'll be releasing our code shortly, but we made few changes to the run_classifier script from the pytorch-pretrained-bert repo.
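For concreteness, here's a minimal sketch of a single fine-tuning step with those hyperparameters under pytorch-pretrained-bert (not our exact script: the checkpoint path, the toy sentence pair, and the `t_total` value are placeholders you'd replace with the real MedNLI training setup):

```python
# Minimal sketch, not the released code. Assumes the released checkpoint has
# been unpacked to a local directory (path below is a placeholder).
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForSequenceClassification, BertAdam

MODEL_DIR = "./clinical_bert_all_notes"  # placeholder path to the released weights

tokenizer = BertTokenizer.from_pretrained(MODEL_DIR)
model = BertForSequenceClassification.from_pretrained(MODEL_DIR, num_labels=3)  # MedNLI has 3 labels

# Encode one premise/hypothesis pair the way run_classifier does:
# [CLS] premise [SEP] hypothesis [SEP], with segment ids 0/1.
premise = tokenizer.tokenize("The patient denies chest pain.")
hypothesis = tokenizer.tokenize("The patient has chest pain.")
tokens = ["[CLS]"] + premise + ["[SEP]"] + hypothesis + ["[SEP]"]
segment_ids = torch.tensor([[0] * (len(premise) + 2) + [1] * (len(hypothesis) + 1)])
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
input_mask = torch.ones_like(input_ids)
label = torch.tensor([2])  # e.g. contradiction

# BertAdam's warmup schedule needs the total step count; with the full dataset
# this would be len(train_loader) * 4 epochs at batch size 16.
optimizer = BertAdam(model.parameters(), lr=2e-5, warmup=0.1, t_total=1000)

model.train()
loss = model(input_ids, segment_ids, input_mask, label)  # returns the loss when labels are given
loss.backward()
optimizer.step()
optimizer.zero_grad()
```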

EmilyAlsentzer commented 5 years ago

We've looked into this further and discovered a small bug was causing us to report results on dev set for the MedNLI task only. Thanks for your help in discovering this! We've resubmitted to arxiv, and the new version should be available shortly.

sudarshan85 commented 5 years ago

> The results are reported on the test set after hyperparameter tuning on the dev set. We found that the best parameters for the model trained on all notes were LR = 2e-5, batch size = 16, and epochs = 4. We'll be releasing our code shortly, but we made few changes to the run_classifier script from the pytorch-pretrained-bert repo.

Do you have a timeline on when the code will be available?

cbockman commented 5 years ago

FYI @EmilyAlsentzer: if I'm interpreting things correctly, your v2 updates Table 2 with the corrected MedNLI numbers but not the discussion text (Section 4: "Notably, on MedNLI, clinical BERT actually yields a new state of the art, yielding a performance of 85.4% accuracy...").

EmilyAlsentzer commented 5 years ago

Thanks for the heads up, @cbockman; we've posted a v3 with the typo fixed.

Preprocessing code is available in the repo now. Apologies for the delay in posting!