hnjzbss / EKAGen

[CVPR 2024] Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation
Apache License 2.0

About training parameters #8

Open kongkong935 opened 1 week ago

kongkong935 commented 1 week ago

Hello, I think your work is excellent, but I ran into a problem when reproducing the code. When training on the IU X-ray dataset, the BLEU score peaked at 0.46 in the fourth epoch, then gradually decreased and stabilized at around 0.25. May I ask what the problem might be?

kongkong935 commented 1 week ago

However, when I tested the weight file you provided, I did get a score of 0.525. Could this be because I directly used the images300_array you provided instead of generating it myself? Or is it that the IU dataset is smaller and therefore more susceptible to hyperparameters?

hnjzbss commented 1 week ago

> Hello, I think your work is excellent, but I ran into a problem when reproducing the code. When training on the IU X-ray dataset, the BLEU score peaked at 0.46 in the fourth epoch, then gradually decreased and stabilized at around 0.25. May I ask what the problem might be?

Thank you for your interest in EKAGen. For text generation tasks trained on small datasets, we have observed similar behavior across many models: the model can be quite sensitive to factors such as batch size and even the GPU used. For IU X-ray, we suggest adjusting the hyperparameters, for example by reducing the batch size.
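Not part of the EKAGen repo, but when comparing your reproduced scores against the released checkpoint, a quick corpus-level BLEU sanity check on the decoded reports can rule out an evaluation-side mismatch. A minimal sketch using NLTK (the token lists here are made-up placeholder reports, not real IU X-ray data):

```python
# Hypothetical sanity check: score generated reports against references
# with corpus-level BLEU, the metric discussed in this thread.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Each hypothesis has a list of reference token lists (one reference here).
references = [
    [["the", "lungs", "are", "clear"]],
    [["no", "acute", "cardiopulmonary", "abnormality"]],
]
hypotheses = [
    ["the", "lungs", "are", "clear"],
    ["no", "acute", "abnormality"],
]

# Smoothing avoids zero scores when short reports have no matching n-grams.
smooth = SmoothingFunction().method1
bleu1 = corpus_bleu(references, hypotheses, weights=(1, 0, 0, 0),
                    smoothing_function=smooth)
bleu4 = corpus_bleu(references, hypotheses, weights=(0.25,) * 4,
                    smoothing_function=smooth)
print(f"BLEU-1: {bleu1:.3f}  BLEU-4: {bleu4:.3f}")
```

Running the same scoring code over both your reproduced outputs and the outputs of the released weights makes the 0.25 vs. 0.525 gap directly comparable, independent of any differences in the evaluation pipeline.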