How to restore part of the model?

DrSleep / tensorflow-deeplab-resnet

DeepLab-ResNet rebuilt in TensorFlow

MIT License


How to restore part of the model? #11

Closed gaopeng-eugene closed 7 years ago

gaopeng-eugene commented 7 years ago

For example, I want to train 'fc1_voc12_c0', 'fc1_voc12_c1', 'fc1_voc12_c2' and 'fc1_voc12_c3' from scratch. To do that, I changed the name to 'fc1_voc12_c0_random'. How can I load the other weights while making sure 'fc1_voc12_c0_random' is initialised from noise?

Changing the name directly causes errors.

gaopeng-eugene commented 7 years ago

I partly solved the problem by using the following code:

restore_var = tf.all_variables()
restore_var.pop()

This is an ugly approach. Can you suggest a more efficient way to load a partial model?

gaopeng-eugene commented 7 years ago

By the way, what is the meaning of NUM_SAVE_IMAGES? Is the true batch size BATCH_SIZE or NUM_SAVE_IMAGES * BATCH_SIZE?

DrSleep commented 7 years ago

You can have a list with names of variables that you don't want to restore, and then choose only those variables that are not in the list (without renaming). E.g.,

not_restore = ['fc1_voc12_c0', 'fc1_voc12_c1', 'fc1_voc12_c2', 'fc1_voc12_c3']
restore_var = [v for v in tf.all_variables() if v.name not in not_restore] # Keep only the variables, whose name is not in the not_restore list.
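One caveat is worth sketching here. Assuming the variables are named with a scope and an output suffix (for example 'fc1_voc12_c0/weights:0'; this naming is an assumption, not something the thread confirms), an exact comparison of `v.name` against the bare layer names would never match. A prefix match on the scope is one possible variant. The `Var` stand-in and the 'conv1' layer below are hypothetical and only replace `tf.Variable` so the sketch runs without TensorFlow:

```python
from collections import namedtuple

# Hypothetical stand-in for tf.Variable: only the .name attribute is used.
Var = namedtuple("Var", ["name"])


def in_scopes(var_name, scopes):
    """Return True if var_name (e.g. 'scope/weights:0') sits under one of the scopes."""
    return any(var_name.startswith(scope + "/") for scope in scopes)


def split_for_restore(variables, not_restore):
    """Split variables into (restore_var, fresh_var) by scope prefix."""
    restore_var = [v for v in variables if not in_scopes(v.name, not_restore)]
    fresh_var = [v for v in variables if in_scopes(v.name, not_restore)]
    return restore_var, fresh_var


not_restore = ['fc1_voc12_c0', 'fc1_voc12_c1', 'fc1_voc12_c2', 'fc1_voc12_c3']

# Assumed variable names; 'conv1' is an illustrative layer, not taken from the thread.
all_vars = [
    Var("conv1/weights:0"),
    Var("fc1_voc12_c0/weights:0"),
    Var("fc1_voc12_c0/biases:0"),
    Var("fc1_voc12_c3/weights:0"),
]

restore_var, fresh_var = split_for_restore(all_vars, not_restore)
print([v.name for v in restore_var])  # only the variables to load from the checkpoint
print([v.name for v in fresh_var])    # the variables left at their random initialisation
```

With real TensorFlow 1.x variables, `restore_var` could then be passed as `tf.train.Saver(var_list=restore_var)`, so that only those variables are read from the checkpoint while the others keep the values given by the initializer.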

The true batch size is BATCH_SIZE. NUM_SAVE_IMAGES has nothing to do with the batch size; it only needs to be less than or equal to BATCH_SIZE, otherwise there won't be enough images in the batch to save.

parser.add_argument("--batch_size", type=int, default=BATCH_SIZE,
                    help="Number of images sent to the network in one step.")

parser.add_argument("--save_num_images", type=int, default=SAVE_NUM_IMAGES,
                    help="How many images to save.")
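The constraint between the two arguments could be checked right after parsing. The following is a minimal sketch, not the repository's code: the default values and the `get_arguments` wrapper with an `argv` parameter are assumptions made for illustration.

```python
import argparse

# Illustrative defaults; the repository's actual values may differ.
BATCH_SIZE = 4
SAVE_NUM_IMAGES = 2


def get_arguments(argv=None):
    """Parse the two arguments and reject save_num_images > batch_size."""
    parser = argparse.ArgumentParser(description="Batch size check (sketch).")
    parser.add_argument("--batch_size", type=int, default=BATCH_SIZE,
                        help="Number of images sent to the network in one step.")
    parser.add_argument("--save_num_images", type=int, default=SAVE_NUM_IMAGES,
                        help="How many images to save.")
    args = parser.parse_args(argv)
    # Only images from the current batch can be saved, so cap the count.
    if args.save_num_images > args.batch_size:
        parser.error("--save_num_images must not exceed --batch_size")
    return args


args = get_arguments(["--batch_size", "8", "--save_num_images", "2"])
print(args.batch_size, args.save_num_images)
```

`parser.error` prints the message and exits with a usage error, so an invalid combination is caught before any training step runs.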

Ariel-JUAN commented 6 years ago

@DrSleep I created a list with the names of the variables that I don't want to restore, as you said.

not_restore = ['fc1_voc12_c0', 'fc1_voc12_c1', 'fc1_voc12_c2', 'fc1_voc12_c3']
restore_var = [v for v in tf.global_variables() if v.name not in not_restore]

I wonder how to define the trainable variables. I wrote it like this:

trainable = [v for v in tf.trainable_variables() if v.name in not_restore]

And I optimize the loss like this:

optim = AdamOptimizer.minimize(loss, var_list=trainable)

But an error occurred, saying there are no variables to optimize. Can you give me some advice? Thanks!

DrSleep commented 6 years ago

You can print out the trainable list and see what you have there.

Ariel-JUAN commented 6 years ago

@DrSleep The trainable list is empty, whereas the restore_var list contains all of the variables. I am confused.

DrSleep commented 6 years ago

Check the names in tf.trainable_variables().
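One likely explanation for the symptom, sketched under the assumption that the names look like 'fc1_voc12_c0/weights:0' (the name strings below are illustrative, not printed from the actual graph): an exact membership test against the bare layer names matches nothing, so the trainable list is empty and the restore list contains everything. Comparing only the leading scope component gives the intended selection.

```python
# Assumed output of [v.name for v in tf.trainable_variables()]; illustrative only.
names = [
    "conv1/weights:0",
    "fc1_voc12_c0/weights:0",
    "fc1_voc12_c0/biases:0",
]
not_restore = ['fc1_voc12_c0', 'fc1_voc12_c1', 'fc1_voc12_c2', 'fc1_voc12_c3']

# Exact match: 'fc1_voc12_c0/weights:0' is never equal to 'fc1_voc12_c0'.
exact = [n for n in names if n in not_restore]

# Scope match: compare only the part before the first '/'.
by_scope = [n for n in names if n.split("/")[0] in not_restore]

print(exact)     # []
print(by_scope)  # ['fc1_voc12_c0/weights:0', 'fc1_voc12_c0/biases:0']
```

Printing the real names first, as suggested above, shows whether this assumption about the naming holds before the filter is changed.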
