ConnorJL / GPT2

An implementation of training for GPT2, supports TPUs
MIT License

117M/model.ckpt.index is corrupted? #29

Open ksjae opened 3 years ago

ksjae commented 3 years ago

I keep getting this error when training starts:

Create CheckpointSaverHook.
Done calling model_fn.
TPU job name worker
Graph was finalized.
Restoring parameters from gs://kogpt2/models/117M/model.ckpt
Error recorded from training_loop: From /job:worker/replica:0/task:0:
File contents are inconsistent for file: gs://kogpt2/models/117M/model.ckpt.index @ 0.
         [[node save/RestoreV2 (defined at /home/ksjcom0705_gmail_com/GPT2/venv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Does anyone have a trained 117M model that I could use for pretraining? It looks like the source checkpoint is damaged somehow (or gsutil is damaging it during the copy).
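
One way to narrow down whether the copy step is at fault is to compare checksums on both sides. The sketch below is only an illustration, not something from this repo: it computes the base64-encoded MD5 of a local file, which is the form `gsutil ls -L gs://...` prints on its `Hash (md5)` line. The helper names `md5_base64` and `matches_remote` are made up for this example, and any local path you pass in is an assumption about where the checkpoint was downloaded.

```python
import base64
import hashlib


def md5_base64(path, chunk_size=1 << 20):
    """Return the base64-encoded MD5 digest of the file at `path`.

    The file is read in chunks so large checkpoint shards do not
    have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def matches_remote(path, remote_md5_b64):
    """Compare a local file against an MD5 value copied by hand
    from the `Hash (md5)` line of `gsutil ls -L` (hypothetical helper)."""
    return md5_base64(path) == remote_md5_b64.strip()
```

If the value for the local `model.ckpt.index` differs from what the bucket reports, the upload is the likely culprit. If they match, the file was probably already damaged before it was copied.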