harvardnlp / encoder-agnostic-adaptation

Encoder-Agnostic Adaptation for Conditional Language Generation
https://arxiv.org/abs/1908.06938
MIT License
79 stars 13 forks

Troubles training a simple network #6

Closed BeersInTLV closed 4 years ago

BeersInTLV commented 4 years ago

Hi, thanks for the great code and paper.

I'm trying to train a simple task as a POC, where (using the paper's notation) X is the first 4 words of a sentence and Y is the sentence itself. From my experience this should not take long to converge. But I'm not able to train it quickly enough (after around an hour of training it works). Any implementation hints on how to make it work? Maybe it's a preprocessing issue? Maybe the fact that we have some pretrained components and a few randomly initialized weights makes it harder to train quickly?

I'm running a setup similar to the CNN psa config (minus the Copy).
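For concreteness, here is a minimal sketch of how the (X, Y) pairs for this task could be built. It assumes plain whitespace tokenization and skips sentences that are not longer than the prefix; the function name `make_pairs` and the `prefix_len` parameter are hypothetical and not part of the repo, whose actual preprocessing may differ.

```python
def make_pairs(sentences, prefix_len=4):
    """Build (X, Y) pairs where X is the first `prefix_len` words of a
    sentence and Y is the full sentence.

    Hypothetical helper for illustration only; assumes whitespace
    tokenization rather than the tokenizer used by the repo.
    """
    pairs = []
    for sentence in sentences:
        words = sentence.split()
        # Skip sentences with nothing left to generate beyond the prefix.
        if len(words) <= prefix_len:
            continue
        source = " ".join(words[:prefix_len])
        target = " ".join(words)
        pairs.append((source, target))
    return pairs
```

Each pair could then be written out as one line of a source file and one line of a target file, so that the two files stay aligned line by line.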

Thanks.