How to convert WideResNet from Tensorflow to Pytorch

uriyapes commented 5 years ago

I am trying to convert the WideResNet model from https://github.com/MadryLab/cifar10_challenge to PyTorch. I am unsure how to extract the model. As I understand it, I need to edit extractor.py (https://github.com/Microsoft/MMdnn/blob/master/mmdnn/conversion/examples/tensorflow/extractor.py), add my architecture to architecture_map, and then call handle_checkpoint(architecture = "my_model", path = "path_to_model_arch.ckpt"). Could you confirm?

A second issue: when I run the conversion without the extraction step, using the normal .meta and .index files, I get an error. The command is:

mmconvert -sf tensorflow -in ./model_0/checkpoint-70000.meta -iw ./model_0/checkpoint-70000.index --inNodeName input/Placeholder --inputShape 32,32,3 --dstNode logit/xw_plus_b -df pytorch -om Madry.pth

Platform (ubuntu 18.04):

Python version: 3.6

Source framework with version: Tensorflow 1.4.1 with GPU:

Destination framework with version Pytorch 1.1.0 with GPU:

Pre-trained model path: https://www.dropbox.com/s/ywc0hg8lr5ba8zd/secret.zip?dl=1

Log:

Parse file [./model_0/checkpoint-70000.meta] with binary format successfully.
Tensorflow model file [./model_0/checkpoint-70000.meta] loaded successfully.
Tensorflow checkpoint file [./model_0/checkpoint-70000.index] loaded successfully. [0] variables loaded.
WARNING:tensorflow:From /home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/tensorflow/python/tools/strip_unused_lib.py:86: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2019-07-11 13:26:00.740033: I tensorflow/tools/graph_transforms/transform_graph.cc:317] Applying fold_constants
2019-07-11 13:26:01.077491: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-11 13:26:01.114085: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1992000000 Hz
2019-07-11 13:26:01.117283: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7fffc3053f60 executing computations on platform Host. Devices:
2019-07-11 13:26:01.117540: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Enter].
TensorflowEmitter has not supported operator [TensorArrayV3] with name [input/map/TensorArray].
TensorflowEmitter has not supported operator [TensorArrayV3] with name [input/map/TensorArray_1].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Less/Enter].
TensorflowEmitter has not supported operator [Range] with name [input/map/TensorArrayUnstack/range].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayReadV3/Enter].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Enter_1].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayWrite/TensorArrayWriteV3/Enter].
TensorflowEmitter has not supported operator [TensorArrayScatterV3] with name [input/map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayReadV3/Enter_1].
IR network structure is saved as [18107b4125234bc6875b19c426de9f47.json].
IR network structure is saved as [18107b4125234bc6875b19c426de9f47.pb].
IR weights are saved as [18107b4125234bc6875b19c426de9f47.npy].
Parse file [18107b4125234bc6875b19c426de9f47.pb] with binary format successfully.
Traceback (most recent call last):
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/emitter.py", line 37, in _load_weights
    self.weights_dict = np.load(file_name).item()
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/numpy/lib/npyio.py", line 447, in load
    pickle_kwargs=pickle_kwargs)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/numpy/lib/format.py", line 696, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/bin/mmconvert", line 10, in <module>
    sys.exit(_main())
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/_script/convert.py", line 108, in _main
    ret = IRToCode._convert(code_args)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/_script/IRToCode.py", line 42, in _convert
    emitter = PytorchEmitter((args.IRModelPath, args.IRWeightPath))
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/pytorch/pytorch_emitter.py", line 41, in __init__
    self._load_weights(weight_path)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/emitter.py", line 39, in _load_weights
    self.weights_dict = np.load(file_name, encoding='bytes').item()
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/numpy/lib/npyio.py", line 447, in load
    pickle_kwargs=pickle_kwargs)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/numpy/lib/format.py", line 696, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False
Error
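
A note on the ValueError in the traceback: it is raised inside NumPy. Starting with NumPy 1.16.3, np.load defaults to allow_pickle=False, while emitter.py reads the IR weights with np.load(file_name).item(), which needs an object array to be unpickled. The following is a minimal sketch of that behaviour. The weights dict and the temporary file are hypothetical stand-ins, not the model from this issue.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for an IR weights file: a nested dict of arrays.
weights = {"init_conv": {"weights": np.zeros((3, 3, 3, 16), dtype=np.float32)}}

path = os.path.join(tempfile.mkdtemp(), "ir_weights.npy")
np.save(path, weights)  # the dict is stored as a pickled 0-d object array

try:
    np.load(path)  # NumPy >= 1.16.3 refuses to unpickle by default
    default_load_failed = False
except ValueError:
    default_load_failed = True

# Opting in explicitly restores the old behaviour; only do this for trusted files.
loaded = np.load(path, allow_pickle=True).item()
print(default_load_failed, sorted(loaded.keys()))
```

Pinning NumPy below 1.16.3 in the conversion environment is another way to get the old default. Whether a given MMdnn release already passes allow_pickle=True is not something this thread confirms.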

JiahaoYao commented 5 years ago

Hi @uriyapes, first, try downgrading PyTorch to 0.4.0. You should then get the following output:

TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Enter].
TensorflowEmitter has not supported operator [TensorArrayV3] with name [input/map/TensorArray].
TensorflowEmitter has not supported operator [TensorArrayV3] with name [input/map/TensorArray_1].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Less/Enter].
TensorflowEmitter has not supported operator [Range] with name [input/map/TensorArrayUnstack/range].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayReadV3/Enter].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/Enter_1].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayWrite/TensorArrayWriteV3/Enter].
TensorflowEmitter has not supported operator [TensorArrayScatterV3] with name [input/map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3].
TensorflowEmitter has not supported operator [Enter] with name [input/map/while/TensorArrayReadV3/Enter_1].
IR network structure is saved as [3127a537335f49ca868351cc64055d1c.json].
IR network structure is saved as [3127a537335f49ca868351cc64055d1c.pb].
IR weights are saved as [3127a537335f49ca868351cc64055d1c.npy].
Parse file [3127a537335f49ca868351cc64055d1c.pb] with binary format successfully.
Target network code snippet is saved as [Madry.py].
Target weights are saved as [3127a537335f49ca868351cc64055d1c.npy].
PyTorch model file is saved as [Madry.pth], generated by [Madry.py] and [3127a537335f49ca868351cc64055d1c.npy]. Notice that you may need [Madry.py] to load the model back.

Even so, the model conversion does not succeed. If you visualize the model, you can see that it contains parts that are hard to convert.

I think this is due to the preprocessing done in TensorFlow.

My suggestion is to dump all the parameters except those of the preprocessing layers. You could reload the TensorFlow model and dump it from https://github.com/MadryLab/cifar10_challenge/blob/master/model.py#L42 onward. The remaining parts are just an ordinary convolutional network and should convert to PyTorch easily. Finally, once the model has been converted, find an equivalent preprocessing function in PyTorch, apply it at the beginning, and feed its output into the converted PyTorch model.
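
As a sketch of that last step: the unsupported operators in the log all sit under input/map/while, which is consistent with a tf.map_fn preprocessing stage. Assuming that stage is tf.image.per_image_standardization (an assumption to check against model.py, not something confirmed here), an equivalent in PyTorch could look like the following. The function name and the batch-first tensor layout are choices made for this example.

```python
import math

import torch


def per_image_standardization(images: torch.Tensor) -> torch.Tensor:
    """Scale each image in a batch to zero mean and unit variance.

    Assumed to mirror tf.image.per_image_standardization:
    (x - mean) / max(stddev, 1 / sqrt(num_elements)), computed per image.
    """
    flat = images.reshape(images.shape[0], -1).float()
    mean = flat.mean(dim=1, keepdim=True)
    # Population standard deviation (no Bessel correction).
    std = flat.std(dim=1, unbiased=False, keepdim=True)
    # Lower bound on the divisor protects against uniform images.
    min_std = 1.0 / math.sqrt(flat.shape[1])
    adjusted_std = torch.clamp(std, min=min_std)
    return ((flat - mean) / adjusted_std).reshape(images.shape)


# Hypothetical usage: standardize a batch, then pass it to the converted network.
batch = torch.rand(4, 3, 32, 32) * 255
standardized = per_image_standardization(batch)
```

Because the statistics are taken over all elements of each image, the result is the same whether the batch is laid out as NHWC or NCHW.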

Hope that works for you.

uriyapes commented 5 years ago

Hi @JiahaoYao, thanks for the quick reply. I tried to follow your suggestions, but unfortunately I still get an error. First I downgraded PyTorch to 0.4.0. Then, following your suggestion, I removed the preprocessing input layer and restored the parameters into the adjusted model using the original .ckpt file, as you can see in the graph.

The adjusted model can be downloaded via: https://www.dropbox.com/s/2etiig0fhj1dsac/model_fix.zip?dl=0

Running the script - mmconvert -sf tensorflow -in ./model_fix/m.meta -iw ./model_fix/m.index --inNodeName input/Placeholder --inputShape 32,32,3 --dstNode logit/xw_plus_b -df pytorch -om Madry.pth

Gave me the following output:

Parse file [./model_fix/m.meta] with binary format successfully.
Tensorflow model file [./model_fix/m.meta] loaded successfully.
Tensorflow checkpoint file [./model_fix/m.index] loaded successfully. [0] variables loaded.
WARNING:tensorflow:From /home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/tensorflow/python/tools/strip_unused_lib.py:86: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2019-07-16 11:28:59.103334: I tensorflow/tools/graph_transforms/transform_graph.cc:317] Applying fold_constants
2019-07-16 11:28:59.174171: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-16 11:28:59.178872: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1992000000 Hz
2019-07-16 11:28:59.179921: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7fffc4b0bc30 executing computations on platform Host. Devices:
2019-07-16 11:28:59.180107: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
Traceback (most recent call last):
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/bin/mmconvert", line 10, in <module>
    sys.exit(_main())
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
    ret = convertToIR._convert(ir_args)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 115, in _convert
    parser.run(args.dstPath)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/parser.py", line 22, in run
    self.gen_IR()
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 424, in gen_IR
    func(current_node)
  File "/home/q/DevPrograms/miniconda3/envs/deep_pnml/lib/python3.6/site-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 548, in rename_Conv2D
    self.set_weight(source_node.name, 'weights', self.ckpt_data[W.name])
KeyError: 'input/init_conv/DW'
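
The KeyError means the parser looked up a variable named input/init_conv/DW that is not among the names it read from the checkpoint (the log also reports "[0] variables loaded"). One way to narrow this down is to compare the names the graph expects with the names actually stored in the checkpoint. The helper below is a hypothetical sketch in plain Python. The checkpoint-side names in the example are invented for illustration. In practice they might come from something like tf.train.list_variables on the checkpoint prefix, which is an assumption about the installed TensorFlow version.

```python
from typing import Dict, Iterable, List


def find_missing_variables(expected: Iterable[str], stored: Iterable[str]) -> Dict[str, List[str]]:
    """For each expected name absent from `stored`, list stored names that
    differ only by a leading scope prefix (likely renamed-scope candidates)."""
    stored_names = list(stored)
    stored_set = set(stored_names)
    report: Dict[str, List[str]] = {}
    for name in expected:
        if name in stored_set:
            continue
        report[name] = [
            s for s in stored_names
            if name.endswith("/" + s) or s.endswith("/" + name)
        ]
    return report


# 'input/init_conv/DW' is the key from the traceback; every other name is hypothetical.
expected_names = ["input/init_conv/DW", "logit/DW"]
stored_names = ["init_conv/DW", "logit/DW", "global_step"]
report = find_missing_variables(expected_names, stored_names)
print(report)
```

An empty candidate list for a missing name would suggest the variable was never saved, and a non-empty one would point to a scope mismatch between the rebuilt graph and the checkpoint.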

yyyyxie commented 4 years ago

Hi, did you finally fix it?

microsoft / MMdnn

How to convert WideResNet from Tensorflow to Pytorch #693