bytes-commerce opened this issue 4 years ago (status: Open)
Same problem here.
Also getting weird output like this.
First of all, thank you for sharing your code! It helped me a lot getting started with GPT-2. I don't know if this is relevant, but I just debugged sample.py.
The output only appends zeros:

```
tf.Tensor([[ 3 13727 5825 0 0 0 0 0 ...]], shape=(1, 515), dtype=int32)
```
If my sequence length is 512, I get 512 zeros (plus the 3 non-zero tokens from my context). So my output is just the words I provided as context, because everything else is 0.
Edit 1: `logits` is always NaN in my case, which results in the 0s.
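If it helps anyone reproduce the symptom: with all-NaN logits, a greedy argmax decode always picks index 0, which matches the stream of zero tokens above. A minimal sketch in plain NumPy (not the repository's actual sampling code):

```python
import numpy as np

# All-NaN "logits", as described above (vocabulary size is arbitrary here).
logits = np.full(10, np.nan)

# np.argmax keeps the first element it treats as maximal; with an all-NaN
# row that is index 0, so every decoding step emits token id 0.
token_id = int(np.argmax(logits))
print(token_id)  # -> 0
```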
Edit 2: `self.embedding_weights` is NaN. Maybe something's wrong with the initializer?
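For anyone debugging the same thing, here is a quick way to confirm whether a weight matrix actually contains NaNs. This is a generic NumPy check, not code from this repository; `count_nans` is just an illustrative helper:

```python
import numpy as np

def count_nans(weights):
    """Count NaN entries in a weight tensor (anything NumPy can convert)."""
    arr = np.asarray(weights, dtype=np.float64)
    return int(np.isnan(arr).sum())

# In a TF2 model you could call e.g. count_nans(model.embedding_weights.numpy());
# a non-zero result means the initializer (or an earlier update) produced NaNs.
healthy = np.zeros((4, 8))
broken = healthy.copy()
broken[1, 3] = np.nan
print(count_nans(healthy), count_nans(broken))  # -> 0 1
```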
First of all, thanks for providing this amazing repository and making this possible on TF2! Secondly, I used the Readme to pre-train my model and then used sequence_generator.py to pass some context to the model.
However, the response is always identical to the context, except that the capital letters are replaced with ??s. So the question is: what am I doing wrong? Have I maybe forgotten something? Is there perhaps an edge case leading to this that could be prevented?
Please let me know any additional information you might need! Thanks a lot!