salesforce / ctrl

Conditional Transformer Language Model for Controllable Generation
https://arxiv.org/abs/1909.05858
BSD 3-Clause "New" or "Revised" License

warn if generation prompt does not start with a control code #50

Closed julien-c closed 5 years ago

dimitri320 commented 5 years ago

@julien-c great idea, and I totally get your commit, but how is the control code/token identified right now in the master branch? I don't see how control_codes.txt is referenced from generate.py. Or am I missing something? I'm pretty sure I set everything up correctly on my V100, but the master branch just doesn't work (it runs, but repeats the last word over and over, so it likely doesn't see my control code), while the lower_memory branch works just fine. Totally lost, to be honest...

julien-c commented 5 years ago

Did you try all the different models? Maybe it's an issue with a specific one?

dimitri320 commented 5 years ago

Yes, I’ve tried both the 512 and 256 models, and in both cases the master branch didn’t work, while the lower_memory branch worked. That’s why I started looking into the code, and I’m a bit lost as to where and how control codes are used.

julien-c commented 5 years ago

A control code is just a token. So if you start with "Joke A man comes into a bar", "Joke" is just a token like the other ones. Not sure I can help with debugging your specific issue, though. Good luck!
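
As a rough illustration of the point above, and of the kind of check this PR's title describes, here is a minimal sketch. It is not the code from the PR: the function names are hypothetical, the set of codes is only an illustrative subset (the ones named in this thread), and a plain whitespace split stands in for the model's real tokenizer.

```python
import warnings

# Illustrative subset only: control codes named in this thread.
# The real list ships with the repository and is longer.
CONTROL_CODES = {"Links", "Books", "Wikipedia"}


def starts_with_control_code(prompt, control_codes=CONTROL_CODES):
    """Return True if the first whitespace-separated token is a control code.

    Assumption: splitting on whitespace approximates tokenization well
    enough for the first word; the actual model uses its own vocabulary.
    """
    tokens = prompt.split()
    return bool(tokens) and tokens[0] in control_codes


def check_prompt(prompt, control_codes=CONTROL_CODES):
    """Warn (hypothetical message) when the prompt lacks a leading control code."""
    ok = starts_with_control_code(prompt, control_codes)
    if not ok:
        warnings.warn("Prompt does not start with a control code.")
    return ok
```

With this sketch, `check_prompt("Wikipedia Salesforce is")` passes silently, while `check_prompt("A man comes into a bar")` emits the warning. The control code gets no special treatment beyond being the first token of the prompt.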

dimitri320 commented 5 years ago

@julien-c thanks a lot, I’ll try running your code, maybe it’ll work. It’s just strange that my setup doesn’t seem to see the control code on the master branch.

dimitri320 commented 5 years ago

@julien-c I’ve got an idea about what might be wrong. I think I might be patching the wrong keras.py file. Can you share the path of the file you are patching (as your setup is GCP with a V100, just like mine)? I’m also using an Anaconda Python 3.7 virtual environment.

keskarnitish commented 5 years ago

Thanks for the PR @julien-c !

@dimitri320, you should patch wherever tensorflow_estimator is installed. You can find this with:

>>> import tensorflow_estimator
>>> tensorflow_estimator.__file__
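
The same lookup can be done from a script, which helps confirm that the active Anaconda environment resolves the package you actually patched. This is only a sketch: `find_package_dir` is a hypothetical helper, and it relies on the standard-library `importlib.util.find_spec`, which locates a top-level package without importing it.

```python
import importlib.util
from pathlib import Path


def find_package_dir(name):
    """Return the directory of an installed top-level package, or None.

    Hypothetical helper: uses the import system's own lookup, so the result
    reflects whichever environment runs this script.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    # Prints None if the package is not installed in this environment.
    print(find_package_dir("tensorflow_estimator"))
```

Running this with the same interpreter that runs generate.py shows which copy of the package is being used, which matters when several environments are installed side by side.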

dimitri320 commented 5 years ago

@keskarnitish I patched the correct file, and I still get the last word copied over and over again... And I don't get the warning that no control word was used, as I am using control words (Links, Books, Wikipedia).

And btw, the new commit works: when I start with a non-control word, it shows me the warning. Thanks for that @julien-c !

Any advice on where else to look for an answer?

PS: I've already spent 3 days on this and really don't know what else to do...