plkmo / BERT-Relation-Extraction

PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
Apache License 2.0

Standard batching vs NCE? #46

Closed varun-tandon closed 1 year ago

varun-tandon commented 1 year ago

Hi! Thanks so much for providing this implementation of the MTB training strategy.

I noticed that the paper's authors use noise contrastive estimation (NCE) in their training scheme, and in this implementation NCE appears to be enabled via an internal batching flag within the dataloader in preprocessing_funcs.

See here: https://github.com/plkmo/BERT-Relation-Extraction/blob/06075620fccb044785f5fd319e8d06df9af15b50/src/preprocessing_funcs.py#L287

Is there a reason for this decision? Has anyone tried using standard batching rather than NCE?
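For context, here is a minimal sketch of the contrast as I understand it. All names here (`standard_batches`, `nce_batches`, `pairs_by_entity`) are hypothetical illustrations, not code from this repo: standard batching just shuffles and chunks the dataset, while NCE-style batching builds each batch around a positive pair (two relation statements sharing an entity pair) plus sampled negatives from other entity pairs.

```python
import random

def standard_batches(samples, batch_size):
    """Standard batching: shuffle the dataset and split into fixed-size batches."""
    shuffled = samples[:]
    random.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]

def nce_batches(pairs_by_entity, negatives_per_pos):
    """NCE-style batching (hypothetical sketch, not the repo's code):
    each batch contains one positive pair -- two statements that mention
    the same entity pair -- plus negatives sampled from other entity pairs."""
    batches = []
    entity_keys = list(pairs_by_entity)
    for key in entity_keys:
        statements = pairs_by_entity[key]
        if len(statements) < 2:
            continue  # need at least two statements to form a positive pair
        positive = random.sample(statements, 2)
        # negatives: statements belonging to any other entity pair
        others = [s for k in entity_keys if k != key for s in pairs_by_entity[k]]
        negatives = random.sample(others, min(negatives_per_pos, len(others)))
        batches.append({"positive": positive, "negatives": negatives})
    return batches
```

With standard batching the model only sees whatever pairs happen to co-occur in a random batch, whereas the NCE-style construction guarantees every batch contains a matched positive and controlled negatives, which is presumably why the authors chose it.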

I'll also try standard batching myself and update this thread if I have any meaningful results.