Open JaniceXiong opened 2 years ago
Hello, just to confirm, did you change the fairseq/trainer.py file? I tried to handle the mismatch issue here: https://github.com/neulab/guided_summarization/blob/ea4bbe91f189cdb51f7f6a827210f9adc5319b3c/bart/fairseq/trainer.py#L173-L207.
Thanks! The code that handles the mismatch (lines 185-207) was missing from my trainer.py. After I restored it, training runs fine.
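For anyone who hits the same error, here is a minimal sketch of the general idea behind tolerating a checkpoint/model mismatch: copy a checkpoint entry only when the model has a parameter with the same name and shape, and keep the model's freshly initialized value otherwise. This is an illustration under my own assumptions, not the code from the linked trainer.py. The function name `merge_compatible_state`, the `fake_tensor` helper, and the parameter names are all hypothetical.

```python
from types import SimpleNamespace


def merge_compatible_state(model_state, checkpoint_state):
    """Return (merged, skipped).

    A checkpoint value is used only if the model has a parameter with the
    same name and the same shape; otherwise the model's own (freshly
    initialized) value is kept and the checkpoint key is reported as skipped.
    """
    merged = dict(model_state)
    skipped = []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            merged[name] = value
        else:
            skipped.append(name)
    return merged, skipped


def fake_tensor(*shape):
    # Hypothetical stand-in for a tensor so the sketch runs without PyTorch;
    # only the .shape attribute is used.
    return SimpleNamespace(shape=shape)


# Hypothetical parameter names, chosen only for illustration.
model_state = {
    "encoder.w": fake_tensor(4, 4),
    "guidance_encoder.w": fake_tensor(4, 4),  # exists only in the new model
    "embed": fake_tensor(10, 4),
}
checkpoint_state = {
    "encoder.w": fake_tensor(4, 4),  # same name and shape: copied
    "embed": fake_tensor(8, 4),      # shape differs: skipped
    "extra": fake_tensor(2),         # unknown to the model: skipped
}

merged, skipped = merge_compatible_state(model_state, checkpoint_state)
print(sorted(skipped))
```

With real PyTorch objects, the two inputs would be `model.state_dict()` and the state dict stored in the checkpoint, and the merged result could then be passed to `model.load_state_dict(...)`.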
Hi, thank you for releasing the trained model. I want to train bart.large on my custom dataset from the beginning, so I set model_path to the fairseq bart.large checkpoint, but this raises the exception below. It seems to be caused by an architecture mismatch. I would like to know where I should make a change to initialize these parameters; I did not find anything about them in the z_train.sh you provided.
As for issue #32, even after I remove --max-sentences 1 as you suggested, the ZeroDivisionError still occurs when I train on multiple GPUs. The error only disappears when I use a single GPU, but that is too slow for training such a big model.
Thanks for your kind help :) @zdou0830