Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io
2.79k stars, 490 forks

Incompatible shapes: [64,1024,1,1] vs. [128,1024,1,1] error #165

Open to-be-snail opened 5 years ago

to-be-snail commented 5 years ago

Because pruning with the RL method takes a very long time, I used the uniform pruning method instead. But when evaluating after pruning, I got this error: Incompatible shapes: [64,1024,1,1] vs. [128,1024,1,1]. Here is the report:

INFO:tensorflow:iter #10000: lr = 6.000000e-02 | loss = 1.254797e+00 | speed = 7114.82 pics / sec
INFO:tensorflow:accuracy = 0.8515625
INFO:tensorflow:model saved to ./models/models_eval_mobilenet_v1_at_cifar10/models/-10000
INFO:tensorflow:Restoring parameters from ./models/models_eval_mobilenet_v1_at_cifar10/models-10000
INFO:tensorflow:model restored from ./models/models_eval_mobilenet_v1_at_cifar10/models-10000
[WARNING] TF-Plus & Horovod cannot be imported; multi-GPU training is unsupported
Traceback (most recent call last):
  File "C:\Users\longy\AppData\Local\Continuum\anaconda3\envs\lab\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\Users\longy\AppData\Local\Continuum\anaconda3\envs\lab\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\longy\AppData\Local\Continuum\anaconda3\envs\lab\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [64,1024,1,1] vs. [128,1024,1,1]
  [[{{node model/MobilenetV1/Logits/Dropout_1b/dropout/mul}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/MobilenetV1/Logits/Dropout_1b/dropout/div, model/MobilenetV1/Logits/Dropout_1b/dropout/mul-1-TransposeNHWCToNCHW-LayoutOptimizer)]]
  [[{{node add_1/_281}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_614_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Can anyone help me? Thanks!
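
For anyone reading the report above, a small sketch may help show which axis of the two operands disagrees. It only parses the error text; the helper name diff_shapes is hypothetical and not part of PocketFlow or TensorFlow. In this message the shapes differ only on the leading axis (64 vs. 128), which is the batch axis in the NCHW layout named by the node. That may point to the two operands being built with different batch sizes, but the thread does not confirm a cause.

```python
import re


def diff_shapes(message):
    """Parse an 'Incompatible shapes' message and list the axes that differ.

    Hypothetical helper for illustration only; it inspects the error text
    and does not touch any TensorFlow graph.
    """
    match = re.search(
        r"Incompatible shapes: \[([\d,]+)\] vs\. \[([\d,]+)\]", message
    )
    if match is None:
        raise ValueError("no 'Incompatible shapes' pattern found")
    lhs = [int(d) for d in match.group(1).split(",")]
    rhs = [int(d) for d in match.group(2).split(",")]
    if len(lhs) != len(rhs):
        # Different ranks: a per-axis comparison is not meaningful.
        return lhs, rhs, None
    axes = [i for i, (a, b) in enumerate(zip(lhs, rhs)) if a != b]
    return lhs, rhs, axes


# Error text copied from the report in this thread.
msg = (
    "tensorflow.python.framework.errors_impl.InvalidArgumentError: "
    "Incompatible shapes: [64,1024,1,1] vs. [128,1024,1,1]"
)
lhs, rhs, axes = diff_shapes(msg)
print(lhs, rhs, axes)  # [64, 1024, 1, 1] [128, 1024, 1, 1] [0]
```

The channel and spatial axes (1024, 1, 1) agree on both sides, so a reasonable first check is where each side of the multiplication gets its leading dimension.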

jiaxiang-wu commented 5 years ago

"mobilenet_v1_at_cifar10" seems like a new model helper? Does this model helper only crash when using the ChannelPrunedLearner, or does it also crash with other learners during evaluation?

to-be-snail commented 5 years ago

"mobilenet_v1_at_cifar10" seems like a new model helper? Does this model helper only crash when using the ChannelPrunedLearner, or does it also crash with other learners during evaluation?

Thanks for your reply. Yes, I trained mobilenet_v1 on CIFAR-10: I copied mobilenet_at_ilsvrc and modified some training parameters to complete the training phase. So far I have only tried the ChannelPrunedLearner, because my work is about channel pruning.

jiaxiang-wu commented 5 years ago

Can you try FullPrecLearner (no compression)?

to-be-snail commented 5 years ago

OK, I will try now, thank you!

to-be-snail commented 5 years ago

Can you try FullPrecLearner (no compression)?

Oh, maybe I didn't explain clearly. I have already finished training on the CIFAR-10 dataset (training and evaluation without compression). Then I tried to prune my model, and this error occurred.

to-be-snail commented 5 years ago

Can you try FullPrecLearner (no compression)?

It seems that after pruning, there are some differences between the training model and the evaluation model.

jiaxiang-wu commented 5 years ago

Okay, so can you post the training and evaluation models you generated?

to-be-snail commented 5 years ago

Okay, so can you post the training and evaluation models you generated?

-10000.zip

INFO:tensorflow:model saved to ./models/models_eval_mobilenet_v1_at_cifar10/models/-10000
INFO:tensorflow:Restoring parameters from ./models/models_eval_mobilenet_v1_at_cifar10/models-10000
INFO:tensorflow:model restored from ./models/models_eval_mobilenet_v1_at_cifar10/models-10000

During pruning, the trained model was saved to -10000, and the evaluation model was read from -10000.
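
One way to look for differences between the training and evaluation models is to compare their variable shapes side by side. The sketch below is a minimal illustration under stated assumptions: compare_shapes is a hypothetical helper, and the variable names and shapes are invented placeholders rather than values read from the attached checkpoint. In TensorFlow the two dictionaries could be filled from the (name, shape) pairs returned by tf.train.list_variables for each checkpoint. Note that a checkpoint stores variables only, so an activation shape such as the one in the error, which depends on the input batch, would not appear in this comparison.

```python
def compare_shapes(train_shapes, eval_shapes):
    """Compare two {variable_name: shape} maps and report the differences.

    Hypothetical helper for illustration; not part of PocketFlow.
    """
    shared = train_shapes.keys() & eval_shapes.keys()
    # Variables present in both maps whose shapes disagree.
    mismatched = {
        name: (tuple(train_shapes[name]), tuple(eval_shapes[name]))
        for name in sorted(shared)
        if tuple(train_shapes[name]) != tuple(eval_shapes[name])
    }
    # Variables that exist on one side only.
    only_train = sorted(train_shapes.keys() - eval_shapes.keys())
    only_eval = sorted(eval_shapes.keys() - train_shapes.keys())
    return mismatched, only_train, only_eval


# Placeholder data: these names and shapes are made up for the example.
train_shapes = {
    "conv_a/weights": [3, 3, 3, 32],
    "conv_b/weights": [1, 1, 32, 64],
    "conv_b/mask": [64],
}
eval_shapes = {
    "conv_a/weights": [3, 3, 3, 32],
    "conv_b/weights": [1, 1, 32, 48],
}

mismatched, only_train, only_eval = compare_shapes(train_shapes, eval_shapes)
print("shape mismatches:", mismatched)
print("only in training model:", only_train)
print("only in evaluation model:", only_eval)
```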

darren1231 commented 5 years ago

I have the same question. Any update?