walsvid / Pixel2MeshPlusPlus

Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation. In ICCV2019.
https://arxiv.org/abs/1908.01491
BSD 3-Clause "New" or "Revised" License

"ValueError: No variables to save" when loading pre-trained CNN #19

Closed: topinfrassi01 closed this issue 3 years ago

topinfrassi01 commented 3 years ago

As the title states, I'm trying to run train_p2mpp.py, and it fails when loading the pre-trained CNN from the checkpoint.

With the original configuration, where I know the path is good:

  pre_trained_cnn_path: dir/models/coarse_mvp2m
  cnn_step: 50

Inside the loadcnn function of MeshNet:

    def loadcnn(self, sess=None, ckpt_path=None, step=None):
        if not sess:
            raise AttributeError('TensorFlow session not provided.')

        variables_to_restore = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='meshnetmvp2m/cnn')
        var_list = {var.name: var for var in variables_to_restore}
        saver = tf.train.Saver(var_list)
        save_path = os.path.join(ckpt_path, '{}.ckpt-{}'.format(self.name, step))
        saver.restore(sess, save_path)
        print('=> !!CNN restored from file: {}, epoch {}'.format(save_path, step))

With the original configuration, I get the following stack trace:

=> load data
=> initialize session
=> load pre-trained cnn
Traceback (most recent call last):
  File "train_p2mpp.py", line 154, in <module>
    main(args)
  File "train_p2mpp.py", line 100, in main
    model.loadcnn(sess=sess, ckpt_path=cfg.p2mpp.pre_trained_cnn_path, step=cfg.p2mpp.cnn_step)
  File "/pix2meshpp/source/modules/models_p2mpp.py", line 109, in loadcnn
    saver = tf.train.Saver(var_list)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 869, in _build
    raise ValueError("No variables to save")
ValueError: No variables to save
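
For context, `tf.train.Saver` raises this error when the `var_list` it receives is empty, which here would mean the scope filter matched nothing in the current graph. Below is a minimal sketch of a guard that makes this visible before the saver is built. The helper names and the example variable names are hypothetical, not taken from the repository, and the sketch works on plain name strings so it runs without TensorFlow:

```python
import re


def select_by_scope(var_names, scope):
    """Keep names that `scope` matches from the start.

    tf.get_collection filters with a regex match on the scope, so this
    uses re.match on plain strings to behave the same way.
    """
    pattern = re.compile(scope)
    return [name for name in var_names if pattern.match(name)]


def require_non_empty(var_names, scope):
    """Return the selected names, or fail with a message naming the scope."""
    selected = select_by_scope(var_names, scope)
    if not selected:
        raise ValueError(
            'No variables under scope {!r}; first graph variables: {}'.format(
                scope, sorted(var_names)[:5]))
    return selected


# Hypothetical variable names, for illustration only.
graph_names = ['meshnet/cnn/conv1/kernel:0', 'meshnet/gcn/w:0']
print(select_by_scope(graph_names, 'meshnet/cnn'))
print(select_by_scope(graph_names, 'meshnetmvp2m/cnn'))
```

With these example names the second filter matches nothing, which is the situation that leads to "No variables to save".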

Here's what I've tried that didn't work:

  1. Changing meshnet to meshnetmvp2m/cnn as proposed in this issue
  2. Changing self.name to meshnetmvp2m, as this is the name of the ckpt file inside the coarse_mvp2m folder

When printing the variables of MeshNetMVP2M, I noticed that the variable names and scope match what's written above: meshnetmvp2m/cnn/*.

However, I've made it work by loading the CNN from refine_p2mpp, changing the configuration file to:

  pre_trained_cnn_path: dir/models/refine_p2mpp
  cnn_step: 10

This feels "hacky", since I'm trying to re-train that same network.

Am I missing something, or is there a bug: loadcnn seems to be written to load from refine_p2mpp, while the configuration file is written to load from coarse_mvp2m?

If I figure out the issue before I get feedback, I'll push a PR.
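
One way to reason about this question: a checkpoint stores variables under the names they had in the graph that wrote it, and `tf.train.Saver` accepts a dict that maps checkpoint names to graph variables. Here is a minimal sketch, assuming the two sides differ only by their leading scope. `build_restore_map` and the example names are hypothetical, and the sketch maps names to names so that it runs without TensorFlow:

```python
def build_restore_map(graph_var_names, graph_scope, ckpt_scope):
    """Map checkpoint variable names to graph variable names.

    Strips the ':0' tensor suffix and swaps the leading scope, so that
    'meshnet/cnn/x:0' in the graph pairs with 'meshnetmvp2m/cnn/x' in
    the checkpoint (both names are illustrative assumptions).
    """
    graph_scope = graph_scope.rstrip('/')
    ckpt_scope = ckpt_scope.rstrip('/')
    prefix = graph_scope + '/'

    restore_map = {}
    for name in graph_var_names:
        if not name.startswith(prefix):
            continue
        op_name = name.split(':')[0]              # drop the ':0' suffix
        suffix = op_name[len(graph_scope):]       # e.g. '/conv1/kernel'
        restore_map[ckpt_scope + suffix] = name
    if not restore_map:
        raise ValueError('No graph variables under {!r}'.format(graph_scope))
    return restore_map


# Hypothetical variable names, for illustration only.
graph_names = ['meshnet/cnn/conv1/kernel:0', 'meshnet/gcn/w:0']
print(build_restore_map(graph_names, 'meshnet/cnn', 'meshnetmvp2m/cnn'))
```

In TF 1.x the dict values would be the variable objects themselves, and `tf.train.list_variables(save_path)` can be used to print the names a checkpoint actually contains before choosing the mapping.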

Thanks for your support!

walsvid commented 3 years ago

Hi, @topinfrassi01. Is the version of TensorFlow you are using the same as in the readme? The early TensorFlow APIs are very confusing, so mismatched versions can cause problems. When I released the code, I tested the complete pipeline and there was no error.

topinfrassi01 commented 3 years ago

Well, that certainly is the problem then. I'm working on TF 1.13. Thanks!
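
For anyone hitting the same error, here is a minimal sketch of a version check that could run before training. The helper names are hypothetical, the expected version has to be taken from the readme (it is not stated in this thread), and in practice the installed version would come from `tf.__version__`:

```python
def parse_major_minor(version):
    """Return (major, minor) as ints from a string such as '1.13.1'."""
    major, minor = version.split('.')[:2]
    return int(major), int(minor)


def same_major_minor(installed, expected):
    """True when both versions share the same major and minor numbers."""
    return parse_major_minor(installed) == parse_major_minor(expected)


# Illustrative version strings only; substitute the readme's version.
print(same_major_minor('1.13.1', '1.13.0'))
```

Comparing only major and minor is a design choice for this sketch; a stricter check could compare the full version string.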