Closed: shakeley closed this issue 3 years ago
My finetuning process is as follows. The blue line represents valid.
Hi,
The dict.extwordssourcetrunclead.txt and dict.targettrunclead.txt files are the vocab files used by the model, which should be identical to your datasets/cnndm/dict.txt if you process the dataset correctly.
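A quick way to check the "should be identical" claim is a byte-for-byte comparison of the vocab files. This is only a minimal sketch: `same_vocab` is a made-up helper name, and the location of the released dict file on disk is an assumption (adjust it to wherever the tar.gz was extracted).

```python
import filecmp
from pathlib import Path


def same_vocab(path_a, path_b) -> bool:
    """Return True only if both files exist and are byte-identical."""
    a, b = Path(path_a), Path(path_b)
    if not (a.is_file() and b.is_file()):
        return False
    # shallow=False forces a content comparison instead of trusting os.stat()
    return filecmp.cmp(a, b, shallow=False)


if __name__ == "__main__":
    # File names come from this thread; the directory layout is assumed.
    released = "dict.targettrunclead.txt"
    own = "datasets/cnndm/dict.txt"
    print("vocab files identical:", same_vocab(released, own))
```

If this reports a mismatch, the preprocessing step is the first thing to re-check.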
Your training curve does seem incorrect though: our validation ppl eventually converges to ~3, but yours seems much larger. Can you post your log file here as well?
BTW, you don't need to double train_steps: it is the number of update steps and thus unrelated to update_freq. (This is a small point, just FYI to save training time.)
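The arithmetic behind this can be sketched in a few lines. The sketch assumes the 1024 quoted in the issue is the per-GPU batch size in tokens, and `effective_batch` is a hypothetical helper, not something from the repo.

```python
def effective_batch(per_gpu_batch: int, num_gpus: int, update_freq: int) -> int:
    """Amount of data behind ONE optimizer update, gradient accumulation included."""
    return per_gpu_batch * num_gpus * update_freq


# Original setup from the issue: 1024 x 8 GPUs x update_freq 8
original = effective_batch(per_gpu_batch=1024, num_gpus=8, update_freq=8)

# 4-GPU setup: halve the GPUs, double update_freq
four_gpu = effective_batch(per_gpu_batch=1024, num_gpus=4, update_freq=16)

print(original, four_gpu)  # 65536 65536

# Each update sees the same amount of data in both setups, so the number of
# update steps (train_steps) needed to match the original run is unchanged.
assert original == four_gpu
```

In other words, doubling update_freq already compensates for having half the GPUs, so train_steps can stay at its original value.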
Thx for the explanations!
It seems that I only have a stdout.log, which contains some information about the training process.
I think I found the issue -- the log says the bart checkpoint is not loaded:
2021-02-28 10:47:44 | INFO | fairseq.trainer | no existing checkpoint found /home/kelixie/ctrlsum/bart.large
The bart checkpoint path passed to the script should be a .pt file instead of a directory.
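To catch this before a long run, something like the following sanity check could be run first. It is a rough sketch, not part of the repo: both helper names are made up, and the only log text it relies on is the "no existing checkpoint found" message quoted above.

```python
from pathlib import Path

MISSING_MARKER = "no existing checkpoint found"


def checkpoint_problem(path):
    """Describe what is wrong with a checkpoint path, or return None if it looks fine."""
    p = Path(path)
    if p.is_dir():
        return f"{p} is a directory; pass the .pt file inside it instead"
    if not p.is_file():
        return f"{p} does not exist"
    if p.suffix != ".pt":
        return f"{p} does not end in .pt"
    return None


def missing_checkpoint_lines(log_text):
    """Return the log lines reporting that no checkpoint was found."""
    return [line for line in log_text.splitlines() if MISSING_MARKER in line]


if __name__ == "__main__":
    log = (
        "2021-02-28 10:47:44 | INFO | fairseq.trainer | "
        "no existing checkpoint found /home/kelixie/ctrlsum/bart.large"
    )
    for line in missing_checkpoint_lines(log):
        print("checkpoint was NOT loaded:", line)
```

Searching stdout.log for that message right after training starts is a cheap way to confirm the pretrained weights were actually picked up.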
Ohhhh... That's a careless mistake :( Grateful for your patient and detailed reply. I can reproduce the results now!
❓ Questions on finetuning
I finetuned the model myself (from the fairseq BART.large ckpt) using train_bart.sh in the repo with src=oraclewordsource, and got a quite strange ROUGE score compared to that of the released ckpt.
Code
I use 4 V100 GPUs, so I changed update_freq to 16 to match the original effective batch size (1024 x 8 GPUs x 8 update_freq). The other finetuning parameters are unchanged. The exact train_bart.sh I used is as follows:
The score on valid set
After 40k steps (actually not necessary, thx for the comment) of finetuning on CNN, I got the following score on the valid set:
The score I obtained with the released ckpt from the repo is:
The difference confuses me a lot. I must be missing some important details.
I notice that the released tar.gz contains some extra files like dict.extwordssourcetrunclead.txt and dict.targettrunclead.txt, which are used for ckpt evaluation but seemingly not for my own finetuning. Is this one of the reasons for my problem? What are these two txt files?
It would be very kind of you to help me. Thanks!