shawwn / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
MIT License

Can't load fine-tuned checkpoint #11

Open klimentij opened 4 years ago

klimentij commented 4 years ago

Hi, and thank you for this great fork! I fine-tuned GPT2-XL on a TPU and am now trying to load it to run inference. interactive_conditional_samples.py fails to load the model (TF 1.15, T4 GPU):

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Attempting to use uninitialized value model/h36/mlp/c_proj/w
     [[{{node model/h36/mlp/c_proj/w/read}}]]
     [[sample_sequence/while/Exit_3/_203]]
  (1) Failed precondition: Attempting to use uninitialized value model/h36/mlp/c_proj/w
     [[{{node model/h36/mlp/c_proj/w/read}}]]
0 successful operations.
0 derived errors ignored.

Any ideas?

Alternatively, is there another way to load the model from this hdf5 format and convert it into a regular checkpoint?

Just trying not to lose 3 days of TPU compute :)