Open yissachar opened 5 years ago
Try to restart the Python session. #77
From the README:
NB: Restart the Python session first if you want to finetune on another dataset or load another model.
From the notebook:
IMPORTANT NOTE: If you want to rerun this cell, restart the VM first (Runtime -> Restart Runtime). You will need to rerun imports but not recopy files.
Thanks, that is what I did to work around this, but it seems like it would be desirable to allow users to re-finetune without restarting. Is there some fundamental limitation that prevents this?
Also, I had read the README but it wasn't clear to me that re-finetuning on the same model was covered by this - perhaps the wording can be tweaked to make this clearer?
it wasn't clear to me that re-finetuning on the same model was covered by this - perhaps the wording can be tweaked to make this clearer?
I agree. The README makes it sound like one does not have to restart the VM if the dataset is identical.
It could be changed to:
NB: Restart the Python session first if you want to finetune further.
Agreed that a README change would make this clearer (my use case for retraining on the same dataset is through the CLI, which refreshes the session; I hadn't considered the Colab notebook use case).
I'll push a change today.
Thanks, that is what I did to work around this, but it seems like it would be desirable to allow users to re-finetune without restarting. Is there some fundamental limitation that prevents this?
It's more-or-less due to how TensorFlow works and I'm not skilled enough with low-level TF to find a workaround.
However, I think I can add a reset function to avoid reloading the notebook, as the implementations used in the Cloud Run APIs reset correctly.
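To illustrate the TensorFlow behaviour referred to above: in TF 1.x, variables are registered by name in the default graph, so building the model a second time in the same session collides with the existing model/wpe. The following is a toy, pure-Python sketch of that mechanism only. `ToyGraph` and `build_model` are hypothetical stand-ins, not TensorFlow or gpt-2-simple APIs.

```python
class ToyGraph:
    """Toy stand-in for a TF 1.x default graph (hypothetical, not a real TF class)."""

    def __init__(self):
        self.variables = {}

    def get_variable(self, name):
        # Like TF 1.x without reuse enabled: a name may only be created once per graph.
        if name in self.variables:
            raise ValueError("Variable %s already exists, disallowed." % name)
        self.variables[name] = object()
        return self.variables[name]

    def reset(self):
        # Plays the role of tf.reset_default_graph(): forget every registered variable.
        self.variables.clear()


def build_model(graph):
    # Stand-in for the model-building step that runs on every fine-tuning call.
    graph.get_variable("model/wpe")


graph = ToyGraph()
build_model(graph)            # first fine-tune: fine

try:
    build_model(graph)        # second fine-tune in the same session: name collision
    second_build_failed = False
except ValueError:
    second_build_failed = True

graph.reset()                 # clear the graph ...
build_model(graph)            # ... and building the model works again
```

This is why restarting the runtime helps: a fresh Python process starts with an empty default graph.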
Try adding tf.reset_default_graph() before each fine-tuning session. This works for me to continue fine-tuning:
import gpt_2_simple as gpt2
import tensorflow as tf
# ...
# Clear the graph left over from the previous run (TF 1.x API)
tf.reset_default_graph()
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              'dataset.txt',
              model_name='345M',
              steps=10)
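If the cell is going to be rerun several times, the same steps could be wrapped in a small helper. This is only a sketch: `finetune_again` is a hypothetical name, not part of gpt-2-simple. It takes the imported `gpt_2_simple` and `tensorflow` modules as arguments so that it does not depend on a particular TensorFlow version. On TF 2.x, `reset_default_graph` lives under `tf.compat.v1`, so that module would be passed instead.

```python
def finetune_again(gpt2, tf, sess, dataset, **finetune_kwargs):
    """Reset TensorFlow state, then run another fine-tuning pass.

    gpt2: the imported gpt_2_simple module.
    tf:   the imported tensorflow module (or tensorflow.compat.v1 on TF 2.x).
    sess: the session from the previous run, or None on the first run.
    """
    # Drop the graph left by the previous run so model/wpe can be created again.
    tf.reset_default_graph()
    # The old session is bound to the discarded graph, so close it.
    if sess is not None:
        sess.close()
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset, **finetune_kwargs)
    return sess


# Intended usage in a notebook cell (requires tensorflow and gpt_2_simple):
#   import tensorflow as tf
#   import gpt_2_simple as gpt2
#   sess = None
#   sess = finetune_again(gpt2, tf, sess, 'dataset.txt', model_name='345M', steps=10)
```

Keeping the returned session in a variable between cells lets the next call close it before starting a new one.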
In my case I put it here:
tf.reset_default_graph()
if not sess:
    sess = gpt2.start_tf_sess()
else:
    sess = gpt2.reset_session(sess)
gpt2.load_gpt2(sess, run_name=run_name)
and it worked perfectly! Thanks!
I'm seeing the same error as outlined in https://github.com/minimaxir/gpt-2-simple/issues/12, even though I am on 0.5.3.
Generate the first time:
It works fine. In a new cell, I copy and paste the above to fine-tune further, but get an error about model/wpe already existing. I tried explicitly setting restore_from='latest', even though that seems to be the default, and it didn't help.