taki0112 / StyleGAN-Tensorflow

Simple & Intuitive Tensorflow implementation of StyleGAN (CVPR 2019 Oral)
MIT License

Test fails to restore from pretrained model checkpoint linked in README #21

Open jpgard opened 3 years ago

jpgard commented 3 years ago

Thanks for the great repo, @taki0112.

I am trying to use the pretrained model located here and linked in the README. I'm able to instantiate the model with .build_model(), and can see the expected output when I run show_all_variables(), but when I attempt to run .test() on the model, it fails to restore from the provided checkpoint.

It seems that there is a mismatch between the variables in the checkpoint and those in the model. It is possible that the hyperparameters or architecture I am using (all of the default values from train.py) do not match those with which the model was trained, or that variables were renamed between the model saved in the checkpoint and the model instantiated by the GitHub code. There is no information about the pretrained model itself, so I do not know.
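One way to confirm such a mismatch is to compare the variable names stored in the checkpoint with those in the instantiated graph. The following is a minimal sketch, not part of the repository: the helper names (`diff_variable_names`, `checkpoint_variable_names`, `graph_variable_names`) are hypothetical, and it assumes a TensorFlow 1.x-style graph where `tf.train.list_variables` and `tf.compat.v1.global_variables` are available. The variable names `G/const` and `G/extra` in the toy example are made up for illustration.

```python
def diff_variable_names(graph_names, ckpt_names):
    """Return (missing_from_ckpt, unused_in_graph) as sorted lists of names."""
    graph_set, ckpt_set = set(graph_names), set(ckpt_names)
    missing_from_ckpt = sorted(graph_set - ckpt_set)  # graph expects these, checkpoint lacks them
    unused_in_graph = sorted(ckpt_set - graph_set)    # checkpoint has these, graph does not
    return missing_from_ckpt, unused_in_graph


def checkpoint_variable_names(ckpt_prefix):
    """List variable names stored in a checkpoint (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so the diff helper runs without TF
    return [name for name, _shape in tf.train.list_variables(ckpt_prefix)]


def graph_variable_names():
    """List variable names in the default graph (requires TensorFlow 1.x-style graph mode)."""
    import tensorflow as tf
    # v.op.name omits the ":0" suffix, matching the keys used in checkpoints.
    return [v.op.name for v in tf.compat.v1.global_variables()]


if __name__ == "__main__":
    # Toy example with made-up names; with TensorFlow installed, pass
    # graph_variable_names() and checkpoint_variable_names(<checkpoint prefix>) instead.
    missing, unused = diff_variable_names(
        ["D/1024x1024/Conv0/bias", "G/const"],
        ["G/const", "G/extra"],
    )
    print("missing from checkpoint:", missing)
    print("unused in graph:", unused)
```

Calling this after `.build_model()` with the checkpoint prefix shown in the log would list every key the graph expects but the checkpoint does not contain, rather than only the first one reported by the restore error.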

Here is the output of my model, printed by the StyleGAN class on instantiation:

##### Information #####
# dataset :  FFHQ
# dataset number :  99
# gpu :  1
# batch_size in train phase :  OrderedDict([(4, 128), (8, 128), (16, 128), (32, 64), (64, 32), (128, 16), (256, 8), (512, 4), (1024, 4)])
# batch_size in test phase :  1
# start resolution :  8
# target resolution :  1024
# iteration per resolution :  1200000
# progressive training :  True
# spectral normalization :  False

Partial stack trace is shown below -- thanks for any suggestions you could provide.

gan.test()

 [*] Reading checkpoints...
INFO:tensorflow:Restoring parameters from /jpgard/StyleGAN-Tensorflow/checkpoint/StyleGAN_FFHQ_8to1024_progressive/StyleGAN.model-224999
---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args)
   1364     try:
-> 1365       return fn(*args)
   1366     except errors.OpError as e:

~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1349       return self._call_tf_sessionrun(options, feed_dict, fetch_list,
-> 1350                                       target_list, run_metadata)
   1351 

~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1442                                             fetch_list, target_list,
-> 1443                                             run_metadata)
   1444 

NotFoundError: 2 root error(s) found.
  (0) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
     [[{{node save/RestoreV2}}]]
  (1) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
     [[{{node save/RestoreV2}}]]
     [[save/RestoreV2/_3385]]
0 successful operations.
0 derived errors ignored.

...

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in restore(self, sess, save_path)
   1299       try:
-> 1300         names_to_keys = object_graph_key_mapping(save_path)
   1301       except errors.NotFoundError:

~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in object_graph_key_mapping(checkpoint_path)
   1617   reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
-> 1618   object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
   1619   object_graph_proto = (trackable_object_graph_pb2.TrackableObjectGraph())

~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py in get_tensor(self, tensor_str)
    914 
--> 915       return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
    916 

NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
<ipython-input-16-822286d4027f> in <module>
      1 # os.listdir(os.path.join(gan.checkpoint_dir, gan.model_dir))
----> 2 gan.test()

~/StyleGAN-Tensorflow/stylegan_tf/StyleGAN.py in test(self)
    554 
    555         self.saver = tf.train.Saver()
--> 556         could_load, checkpoint_counter = self.load(self.checkpoint_dir)
    557         result_dir = os.path.join(self.result_dir, self.model_dir)
    558         check_folder(result_dir)

~/StyleGAN-Tensorflow/stylegan_tf/StyleGAN.py in load(self, checkpoint_dir)
    542         if ckpt and ckpt.model_checkpoint_path:
    543             ckpt_name = os.path.basename(ckpt.model_checkpoint_path)
--> 544             self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name))
    545             counter = int(ckpt_name.split('-')[-1])
    546             print(" [*] Success to read {}".format(ckpt_name))

~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in restore(self, sess, save_path)
   1304         # a helpful message (b/110263146)
   1305         raise _wrap_restore_error_with_msg(
-> 1306             err, "a Variable name or other graph key that is missing")
   1307 
   1308       # This is an object-based checkpoint. We'll print a warning and then do

NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

2 root error(s) found.
  (0) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
     [[node save/RestoreV2 (defined at /homes/gws/jpgard/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
     [[node save/RestoreV2 (defined at /homes/gws/jpgard/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
     [[save/RestoreV2/_3385]]
0 successful operations.
0 derived errors ignored.
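The final error above says that the graph contains at least one variable (`D/1024x1024/Conv0/bias`) that the checkpoint does not. As a diagnostic workaround, one could restore only the variables whose name and shape match the checkpoint. This is a sketch under assumptions, not the repository's own loading code: `matching_names` and `restore_matching` are hypothetical helpers, and the code assumes TensorFlow 1.x-style sessions via `tf.compat.v1`. It does not resolve the underlying mismatch, since any variable left out keeps its initial value.

```python
def matching_names(graph_shapes, ckpt_shapes):
    """Names present in both mappings with identical shapes, sorted."""
    return sorted(
        name
        for name, shape in graph_shapes.items()
        if name in ckpt_shapes and list(shape) == list(ckpt_shapes[name])
    )


def restore_matching(sess, ckpt_prefix):
    """Restore only the graph variables found in the checkpoint.

    Returns the sorted names of variables that were NOT restored.
    Requires TensorFlow; run the global variables initializer first so that
    skipped variables are at least initialized.
    """
    import tensorflow as tf  # imported lazily so matching_names runs without TF
    tf1 = tf.compat.v1

    # (name, shape) pairs stored in the checkpoint.
    ckpt_shapes = {name: list(shape) for name, shape in tf.train.list_variables(ckpt_prefix)}
    variables = tf1.global_variables()
    graph_shapes = {v.op.name: v.shape.as_list() for v in variables}

    keep = set(matching_names(graph_shapes, ckpt_shapes))
    var_list = [v for v in variables if v.op.name in keep]

    # A Saver restricted to var_list will not request keys missing from the checkpoint.
    tf1.train.Saver(var_list=var_list).restore(sess, ckpt_prefix)
    return sorted(set(graph_shapes) - keep)
```

The returned list shows exactly which weights could not be loaded, which may help determine whether the checkpoint was saved with a different architecture or set of hyperparameters.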