hanjanghoon / BERT_FP

Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Issue on reimplementation experiment #7


KuzmaNg commented 2 years ago

Hi, authors of BERT-FP, the SOTA in response selection tasks. I'm excited to see that the post-training strategy works so well with sub-context-response pairs. Recently I tried to reimplement this work, but I have run into a few points of confusion. I would appreciate it if you could take a little time to help.

  1. In post-training (Section 4.2), the paper says it "constructed 6M sub-context-response pairs for Douban", but I only get about 2M when running the provided code (FPT/e-commerce_final.py). I can't figure out what is missing.
  2. Was EDC pre-trained for 34 epochs with only one GPU card? Params: seq_len=240, train_batch_size=50.
  3. About the results without fine-tuning (BERT-FP-NF): how do you convert the 3-class (0,1,2) NSP task from pre-training into a 2-class (0,1) task? (A rough sketch of what I have in mind follows this list.)
  4. About fine-tuning: the default number of epochs is 2; is that the number used for the reported results? BERT_finetuning.py suggests that patience only comes into play when epochs > 2.
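
For question 3, here is a minimal sketch of what I have in mind, assuming the post-training head outputs 3-class logits and that index 1 is the "correct response" class (both are my assumptions; the actual label mapping is whatever the FPT scripts define):

```python
import torch

CORRECT_LABEL = 1  # assumed index of the "correct next utterance" class

@torch.no_grad()
def binary_relevance_scores(model, input_ids, attention_mask, token_type_ids):
    """Collapse the assumed 3-class post-training head into one score per pair.

    The softmax probability of the assumed "correct response" class is used as
    a binary relevance score, so candidates can be ranked for R@k without any
    fine-tuning (the BERT-FP-NF setting).
    """
    logits = model(input_ids=input_ids,
                   attention_mask=attention_mask,
                   token_type_ids=token_type_ids)  # assumed shape: (batch, 3)
    probs = torch.softmax(logits, dim=-1)
    return probs[:, CORRECT_LABEL]
```

Is this roughly how BERT-FP-NF was evaluated, or was the head retrained with binary labels?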

Thank you, I really appreciate it.

Kirili4ik commented 1 year ago

Hi! I'm not the author, but I think I have found the answer to your question 1.

  1. I believe 6M is 2M * 3, because for each sub-context there is 1 correct response, 1 random utterance from the same dialogue, and 1 completely random one. My guess is that the authors counted all three pairs for the paper. (Running the Ubuntu dataset I also see about 4M, while the paper states 12M.)
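
To illustrate, a minimal sketch (not the authors' exact code; the 1/0/2 label mapping is my assumption, check the FPT scripts for the real one) of how each sub-context would yield three labeled pairs:

```python
import random

def make_scr_pairs(sub_context, true_response,
                   same_dialogue_utterances, corpus_utterances):
    """Build three labeled pairs from one sub-context.

    Assumed labels: 1 = correct next utterance, 0 = random utterance from the
    same dialogue, 2 = random utterance from the whole corpus. Counting all
    three pairs gives 3x the number of sub-contexts, which would explain
    ~2M sub-contexts being reported as ~6M pairs.
    """
    return [
        (sub_context, true_response, 1),
        (sub_context, random.choice(same_dialogue_utterances), 0),
        (sub_context, random.choice(corpus_utterances), 2),
    ]
```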

BTW, have you found out the number of epochs needed for PT and FT? And where does your "34 epochs" come from? It would be really helpful if you could provide any additional info. Thank you in advance.