mandubian / pytorch_math_dataset

Pytorch Playground for Mathematical Reasoning Dataset

Beam search in dgl transformer #1

Open YongtaoGe opened 5 years ago

YongtaoGe commented 5 years ago

Really nice work! Have you found the reason why beam search fails in the DGL transformer?

mandubian commented 5 years ago

Thanks ;) No, I haven't looked into that yet; I've been working on other topics lately. But I'm not sure whether it's an issue with beam search or just the model being limited in what it has learnt. (Let's say training isn't very fast on my own GPU.)

YongtaoGe commented 5 years ago

I used this code to train a basic transformer model on the whole dataset. The training process seems correct, as the loss keeps decreasing. But I found that the model can't get the right answer to basic arithmetic add_sub_multiple questions like "Evaluate 1 + 1 + (4 - 7) - -2." In the original paper, this sub-module gets very high accuracy. Have you run into this problem as well?
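
For reference, the expected answer to that question can be checked directly. This is only a minimal sketch: `exact_match` is a hypothetical helper, not a function from this repo, and it assumes predictions are scored by exact string comparison against the target.

```python
# Reference value for the example question, computed directly in Python.
expected = 1 + 1 + (4 - 7) - -2  # 2 + (-3) + 2


def exact_match(prediction: str, target: str) -> bool:
    """Hypothetical scoring helper: correct only if the strings are identical
    (ignoring surrounding whitespace)."""
    return prediction.strip() == target.strip()


print(expected)                            # 1
print(exact_match("1", str(expected)))     # True
print(exact_match("-1", str(expected)))    # False
```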

[Image: loss log]
mandubian commented 5 years ago

Do you mean that with a normal transformer you also have issues with beam search, not only with the DGL transformer? If so, that would help narrow down the cause, because I have no idea yet...

YongtaoGe commented 5 years ago

Yep! Some modules work fine while others don't. I haven't been able to locate the bug.
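
One way to separate a decoding bug from a model limitation is to compare beam search against greedy decoding: with a beam width of 1 the two should return the same sequence. The sketch below is not the repo's decoder. `beam_search`, `greedy`, and `toy_step` are hypothetical names, and `toy_step` stands in for a real model by returning fixed log-probabilities for the next token.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"
StepFn = Callable[[List[str]], Dict[str, float]]  # prefix -> {token: log-prob}


def greedy(step_fn: StepFn, max_len: int) -> List[str]:
    """Pick the single most likely token at every step."""
    tokens: List[str] = []
    for _ in range(max_len):
        dist = step_fn(tokens)
        tok = max(dist, key=dist.get)
        tokens.append(tok)
        if tok == EOS:
            break
    return tokens


def beam_search(step_fn: StepFn, beam_width: int, max_len: int) -> List[str]:
    """Keep the `beam_width` best partial sequences by summed log-probability."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        candidates = [
            (tokens + [tok], score + logp)
            for tokens, score in beams
            for tok, logp in step_fn(tokens).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            # Sequences ending in EOS leave the beam and are kept as results.
            (finished if tokens[-1] == EOS else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])[0]


def toy_step(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a model: fixed next-token log-probabilities."""
    table = {
        (): {"1": 0.6, "-": 0.4},
        ("1",): {EOS: 0.55, "0": 0.45},
        ("-",): {"1": 0.95, "2": 0.05},
    }
    probs = table.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


print(greedy(toy_step, 5))          # ['1', '<eos>']
print(beam_search(toy_step, 1, 5))  # ['1', '<eos>']
print(beam_search(toy_step, 2, 5))  # ['-', '1', '<eos>']
```

If width-1 beam search and greedy decoding disagree on a real model, the decoding code is the likelier culprit. If they agree and both give wrong answers on a module, the model itself is the more plausible cause.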