Open · yansir-X opened 4 years ago

Dear Mr. Quach, I made some changes to synthesis_transform and analysis_transform in compression_model.py and retrained the model. But when I run compress.py with the newly retrained model, the following error occurs:

```
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [32]
	 [[node save/Assign (defined at /home/stud_yang/.conda/envs/myenv/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py:627) ]]
```

I guess somehow the old model gets involved? I've spent quite some time searching for a solution and trying things, with no progress. Maybe you have some ideas?

Sorry to always bother you. :) Best
Hello @yansir-X,

By any chance, did you train your model with the `--num_filters` option? If so, you also need to add the same `--num_filters` option for compression and decompression.

Best,
Thanks for answering. No, I didn't use the `--num_filters` option. I searched online and found suggestions like `tf.reset_default_graph()`, but I don't know where I should add that, or whether I should do something else. I'm really stuck on this problem. Best
I am not sure this would solve your problem.

From the error, it seems that the number of filters in the trained model differs between training and compression: `[128]` vs. `[32]` looks like a shape mismatch for a bias variable.

Did you increase the number of filters in `compression_model.py`? Also, maybe you are still using the path to the old model instead of the new one when using `compress.py`?

Can you give me the commands you used before encountering this error? Also, what changes did you make to the model?
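One quick way to narrow it down is to list the variables stored in the checkpoint and compare their shapes against the graph. A minimal sketch (assuming TensorFlow 1.x-style APIs; the checkpoint path is a placeholder, adjust it to your model directory):

```python
# Minimal diagnostic sketch: print every variable name and shape stored in a
# TF 1.x checkpoint, to find the variable behind the
# "lhs shape= [128] rhs shape= [32]" mismatch.
import tensorflow.compat.v1 as tf

ckpt_dir = "../models/Model256_new"  # placeholder path, adjust to your setup
for name, shape in tf.train.list_variables(ckpt_dir):
    print(name, shape)
```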
The only thing I changed is in analysis_transform and synthesis_transform, where I added batch normalization and another conv and deconv layer:
```python
def analysis_transform(tensor, num_filters, data_format):
  with tf.variable_scope("analysis"):
    with tf.variable_scope("layer_0"):
      layer = tf.layers.Conv3D(
          num_filters, (9, 9, 9), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    # added: batch normalization
    with tf.variable_scope("bnorm"):
      layer = tf.layers.BatchNormalization(
          axis=2, momentum=0.99, epsilon=0.001, center=True, scale=True,
          beta_initializer='zeros', gamma_initializer='ones',
          moving_mean_initializer='zeros', moving_variance_initializer='ones',
          beta_regularizer=None, gamma_regularizer=None, beta_constraint=None,
          gamma_constraint=None, renorm=False, renorm_clipping=None,
          renorm_momentum=0.99, fused=None, trainable=True,
          virtual_batch_size=None, adjustment=None, name=None)
      tensor = layer(tensor)

    with tf.variable_scope("layer_1"):
      layer = tf.layers.Conv3D(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    with tf.variable_scope("layer_2"):
      layer = tf.layers.Conv3D(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=False, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    with tf.variable_scope("layer_3"):
      layer = tf.layers.Conv3D(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=False, activation=None, data_format=data_format)
      tensor = layer(tensor)

    return tensor
```
```python
def synthesis_transform(tensor, num_filters, data_format):
  with tf.variable_scope("synthesis"):
    with tf.variable_scope("layer_0"):
      layer = tf.layers.Conv3DTranspose(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    with tf.variable_scope("layer_1"):
      layer = tf.layers.Conv3DTranspose(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    with tf.variable_scope("layer_2"):
      layer = tf.layers.Conv3DTranspose(
          num_filters, (5, 5, 5), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    with tf.variable_scope("layer_3"):
      layer = tf.layers.Conv3DTranspose(
          1, (9, 9, 9), strides=(2, 2, 2), padding="same",
          use_bias=True, activation=tf.nn.relu, data_format=data_format)
      tensor = layer(tensor)

    return tensor
```
I think the issue is that `axis` should be `1` (the channels axis) instead of `2`, since the model is in `channels_first` mode.
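A small sketch of why that matters (illustrative shapes, assuming TF 1.x-style APIs via `tf.compat.v1`; this is not code from the repo): with `channels_first` (NCDHW) input, `axis=2` is a spatial axis, so the BatchNorm parameters follow the spatial size rather than the channel count.

```python
# Sketch: BatchNorm parameter shapes under channels_first (NCDHW) input.
# axis=2 (spatial) ties the parameters to the resolution; axis=1 (channels)
# ties them to num_filters. The shapes below are illustrative.
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# batch=1, channels=16, spatial 32x32x32 (e.g. resolution 64 after one stride-2 conv)
x = tf.placeholder(tf.float32, [1, 16, 32, 32, 32])

bn_spatial = tf.layers.BatchNormalization(axis=2)  # what the modified model does
bn_channel = tf.layers.BatchNormalization(axis=1)  # what channels_first needs
bn_spatial(x)
bn_channel(x)

print(bn_spatial.gamma.shape)  # (32,) -- would be (128,) at input resolution 256
print(bn_channel.gamma.shape)  # (16,) -- independent of resolution
```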
> Also, maybe you are still using the path to the old model instead of the new one when using compress.py?

I am using the new model for compress.py.

> Can you give me the commands you used before encountering this error?

What do you mean? It's just:

```
python train.py "../data/ModelNet40_pc_64/*/.ply" ../models/Model256_new --resolution 64 --lmbda 0.000001
python compress.py ../data/m40/ "*/.ply" ../data/msft_bin_256new ../models/Model256_new --resolution 256
```
Ok, so I am pretty sure that the `axis` is the issue. After the first convolution, 256 / 2 = 128 (256 being the compression resolution) conflicts with 64 / 2 = 32 (64 being the training resolution). BatchNorm should be on `axis=1` (the channels axis), since the model is in `channels_first` mode, not on `axis=2` (a spatial axis).
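Spelled out as a quick check (a sketch; `layer_0` uses `strides=(2, 2, 2)`, so the spatial size right after it is `resolution // 2`, which is exactly the length of the mismatched variable):

```python
# With BatchNorm on a spatial axis (axis=2), the parameter length equals the
# spatial size after layer_0, i.e. resolution // 2, so it changes between runs.
for phase, resolution in [("training", 64), ("compression", 256)]:
    print(phase, "-> BatchNorm parameter shape", [resolution // 2])
# training -> [32], compression -> [128]: the rhs/lhs shapes in the restore error.
```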
You are right, Mr. Quach! Thanks for your help!
No problem, happy to help!