openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

interactive_conditional_samples.py crashes if there is more than one context token #306

Open Nicholas-Markley opened 2 years ago

Nicholas-Markley commented 2 years ago

I can run the generate_unconditional_samples.py script on my GPU without issue. However, when I run the interactive_conditional_samples.py script, it crashes if the prompt encodes to more than one context token.

The interactive_conditional_samples.py script works fine as long as the model prompt produces only one context token. For instance, the prompt "please" produces the token list [29688] and text is generated correctly. It crashes as soon as the model prompt produces two or more context tokens. For instance, the prompt "pig" produces the token list [79, 328] and the script crashes immediately.

When it crashes I'm getting the error: failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED

And a little further down I see:

Blas xGEMMBatched launch failed : a.shape=[25,2,64], b.shape=[25,2,64], m=2, n=2, k=64, batch_size=25
         [[{{node sample_sequence/model/h0/attn/MatMul}}]]
         [[sample_sequence/while/Exit_3/_1375]]
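
To rule out a plain shape mismatch, the numbers in that log line can be re-derived from the two operand shapes. The helper below is only a sketch: `batched_gemm_dims` is a hypothetical name, not part of TensorFlow or cuBLAS, and it assumes the log's m, n, k fields follow the usual row-major reading of `a @ b^T`.

```python
def batched_gemm_dims(a_shape, b_shape, transpose_b=True):
    """Return (m, n, k, batch_size) for a batched matmul of a and b.

    Hypothetical helper for reading the log line; leading dimensions
    are treated as batch dimensions and must match exactly.
    """
    *a_batch, a_rows, a_cols = a_shape
    *b_batch, b_rows, b_cols = b_shape
    if a_batch != b_batch:
        raise ValueError("batch dimensions differ")
    if transpose_b:
        # b is used as b^T, so its last two dimensions swap roles.
        b_rows, b_cols = b_cols, b_rows
    if a_cols != b_rows:
        raise ValueError("inner dimensions differ")
    batch_size = 1
    for dim in a_batch:
        batch_size *= dim
    return a_rows, b_cols, a_cols, batch_size


# Shapes copied from the error message.
print(batched_gemm_dims([25, 2, 64], [25, 2, 64]))  # (2, 2, 64, 25)
```

The result matches m=2, n=2, k=64, batch_size=25 from the log, so the operands are compatible on paper and the failure happens when the kernel is launched, not because of the shapes.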

If anyone has any insight on what might be going wrong, and how I can fix it, I'd really appreciate the help.

huangh12 commented 1 year ago

Update: The problem occurs with tf1.12/1.15 and tf2.0, but disappears with tf2.3.0.


I ran into this problem too. After hours of debugging, it looks like a bug in TensorFlow. Suppose you input three tokens: model.py then computes something like the logic below in w = tf.matmul(q, k, transpose_b=True). This is fine while the graph is being built, but it crashes when the session actually runs it.

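import tensorflow as tf  # tf.Session below is the TF 1.x API

# Shapes are [batch, heads, tokens, features per head]; 3 tokens as in the example.
a = tf.random.uniform([1, 12, 3, 64])
b = tf.random.uniform([1, 12, 3, 64])
# Same call as in model.py: multiply a by the transpose of b.
c = tf.matmul(a, b, transpose_b=True)
with tf.Session() as sess:
    # The crash happens here, when the graph is executed on the GPU.
    print(sess.run(c).shape)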
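
For comparison, here is a CPU-only sketch of the same product in NumPy. It does not touch the GPU or TensorFlow, so it only shows what a working run of the snippet above should produce; the seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

# Same shapes as the TensorFlow snippet: [batch, heads, tokens, features per head].
rng = np.random.default_rng(0)
a = rng.random((1, 12, 3, 64), dtype=np.float32)
b = rng.random((1, 12, 3, 64), dtype=np.float32)

# Equivalent of tf.matmul(a, b, transpose_b=True): swap the last two axes of b.
c = np.matmul(a, np.swapaxes(b, -1, -2))
print(c.shape)  # (1, 12, 3, 3)
```

If the TensorFlow version prints the same shape, the matmul ran correctly. If it fails with the cuBLAS error instead, the problem lies in the GPU execution path rather than in the computation itself.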