Closed: diogo-cruz closed this issue 1 year ago
Current version at: ac108c83563e314297bb1747038b92c16df387ab
This issue only concerns putting together small components and forming a skeleton, to be filled in later with the real model and dataset.
Now that we are loading GPT-2 again, this warning shows up again:
[WARNING] Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
I think @EdoardoPona solved this last time, right?
Yes, we need to manually set the pad token and the EOS token to be the same thing. GPT-2 doesn't do this by default, though, so I'm not sure it will work now that we are actually using the real pre-trained GPT-2. As long as this doesn't disrupt training/logging much, it isn't an issue as far as convergence is concerned.
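For reference, a minimal sketch of that workaround using the standard Hugging Face transformers API (the model/tokenizer names and the generation call here are illustrative, not the exact code in this repo):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# GPT-2 ships without a pad token, so reuse the EOS token for padding.
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = model.config.eos_token_id

# Passing pad_token_id explicitly also silences the
# "Setting `pad_token_id` to `eos_token_id`" warning during generation.
inputs = tokenizer("Hello, world", return_tensors="pt", padding=True)
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```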
While we are still figuring out how to get this to work perfectly, that is beyond the scope of this specific issue.
This consists of: