Retraining mobilenetv2 and compiling for vision bonnet

Hi,

I've been able to successfully retrain MobileNetV2 models using a modified version of the retrain.py script provided with TensorFlow for Poets.

When attempting to optimize those graphs with optimize_for_inference.py, I get warnings of the following type:

    W0321 21:40:41.069447 140397580433216 optimize_for_inference_lib.py:244] Didn't find expected Conv2D input to 'MobilenetV2/expanded_conv_16/depthwise/BatchNorm/FusedBatchNorm'

Would the Graph Transform Tool work better for this? If so, are there any examples of using it on a retrained graph?
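To make the question concrete, here is a minimal sketch of what a Graph Transform Tool pass might look like from Python. It assumes a TensorFlow build that ships `tensorflow.tools.graph_transforms.TransformGraph`, reuses the tensor names `input` and `final_result` from my compiler command, and picks a transform list from the tool's general deployment examples. The helper names `build_transforms` and `transform_frozen_graph` are hypothetical, and I have not confirmed that the bonnet compiler accepts the result.

```python
def build_transforms(input_size=224):
    """Return a list of Graph Transform Tool transforms (an assumed selection)."""
    shape = "1,{0},{0},3".format(input_size)
    return [
        # Drop nodes that are not needed to compute the requested outputs.
        'strip_unused_nodes(type=float, shape="{}")'.format(shape),
        # Remove pass-through and debugging ops left over from training.
        "remove_nodes(op=Identity, op=CheckNumerics)",
        # Pre-compute constant sub-expressions.
        "fold_constants(ignore_errors=true)",
        # Fold batch-norm scaling into the preceding convolution weights.
        "fold_batch_norms",
        "fold_old_batch_norms",
    ]


def transform_frozen_graph(in_path, out_path,
                           input_names=("input",),
                           output_names=("final_result",),
                           input_size=224):
    """Apply the transforms to a frozen GraphDef and write the result.

    TensorFlow is imported lazily so that build_transforms() can be
    inspected on a machine without TensorFlow installed.
    """
    import tensorflow as tf
    from tensorflow.tools.graph_transforms import TransformGraph

    graph_def = tf.compat.v1.GraphDef()
    with open(in_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    transformed = TransformGraph(
        graph_def,
        list(input_names),
        list(output_names),
        build_transforms(input_size),
    )

    with open(out_path, "wb") as f:
        f.write(transformed.SerializeToString())
    return transformed


if __name__ == "__main__":
    for transform in build_transforms():
        print(transform)
```

Calling `transform_frozen_graph("tf_files/pbmodels/retrained_graph_mobilenet_v2_1.0_224.pb", out_path)` with an output path of your choice would then write the transformed graph; whether this removes the FusedBatchNorm warnings is exactly what I am unsure about.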

The real problem comes when I try to run the bonnet model compiler. I use the following command line:

    ./scripts/bonnet_model_compiler.par \
        --frozen_graph_path=tf_files/pbmodels/retrained_graph_mobilenet_v2_1.0_224.pb \
        --output_graph_path=tf_files/binaryprotos.usda_mobilenet_v2_1.0_224.binaryproto \
        --input_tensor_name=input \
        --output_tensor_names=final_result \
        --input_tensor_size=224 \
        --debug

and the compiler aborts (a failed check followed by SIGABRT, rather than a true seg fault) with the following output:

    W0321 22:05:42.662239 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662446 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_1/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662486 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_2/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662509 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_3/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662527 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_4/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662545 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_5/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662561 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_6/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662578 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_7/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662594 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_8/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662610 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_9/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662627 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_10/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662643 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_11/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662660 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_12/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662681 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_13/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662697 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_14/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662712 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_15/add of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662727 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_16/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
    W0321 22:05:42.662752 7104 preprocess_mognet_model_for_myriad_main.cc:700] Graph has non-fatal issues: /tmp/tmps0XkUm/temp_graph.bp, errors=1
    F0321 22:05:45.083605 7104 allocate_memory_on_myriad.cc:307] Check failed: DefragmentAndAllocate(op_output_tensor_name, signed_output_size_bytes, &output_tensor_offset, &move_ops_added_by_defragmentation, cg_proto) Not enough primary memory!
    Check failure stack trace:
        @ 0x68ae7f (unknown)
        @ 0x68b664 (unknown)
        @ 0x68d3e9 (unknown)
        @ 0x4883df (unknown)
        @ 0x48918f (unknown)
        @ 0x406a43 (unknown)
        @ 0x4033d9 (unknown)
        @ 0x729a01 (unknown)
        @ 0x4005a9 (unknown)
    SIGABRT received by PID 7104 (TID 7104) from PID 7104;
    I0321 22:05:45.084316 7104 process_state.cc:293] RAW: ExecuteFailureCallbacks() safe
    I0321 22:05:45.084333 7104 process_state.cc:1175] RAW: FailureSignalHandler(): starting unsafe phase
    I0321 22:05:45.084344 7104 coreutil.cc:276] RAW: Attempting to connect to coredump socket @core
    I0321 22:05:45.084382 7104 coreutil.cc:279] RAW: Failed to connect to coredump socket @core
    I0321 22:05:45.084408 7104 coreutil.cc:208] RAW: Attempting to dump core
    I0321 22:05:45.084735 7104 coreutil.cc:244] RAW: WriteCoreDumpWith returns: 0
    W0321 22:05:45.084995 7104 process_state.cc:1213] --- CPU registers: ---
    W0321 22:05:45.085007 7104 process_state.cc:1213] r8=5c945f19 r9=7fffffffffffffff r10=8 r11=206 r12=7f533d263100
    W0321 22:05:45.085015 7104 process_state.cc:1213] r13=7f533e1e0508 r14=1190890 r15=1190950 rdi=1bc0 rsi=1bc0 rbp=7ffd8c404d60
    W0321 22:05:45.085024 7104 process_state.cc:1213] rbx=1190890 rdx=6 rax=0 rcx=716f3d rsp=7ffd8c404d60 rip=716f3d efl=206
    W0321 22:05:45.085031 7104 process_state.cc:1213] cgf=2b000000000033 err=0 trp=0 msk=fffffffe10000000 cr2=0
    I0321 22:05:45.085047 7104 process_state.cc:526] --- Memory map: ---
    I0321 22:05:45.085085 7104 process_state.cc:526] 00400000-00937000: /tmp/tmps0XkUm/tool_b.bin
    I0321 22:05:45.085103 7104 process_state.cc:526] 7ffd8c518000-7ffd8c51a000: [vdso]
    I0321 22:05:45.085108 7104 process_state.cc:526] ffffffffff600000-ffffffffff601000: [vsyscall]
    I0321 22:05:45.085120 7104 process_state.cc:297] RAW: ExecuteFailureCallbacks() unsafe
    F0321 22:05:45.083605 7104 allocate_memory_on_myriad.cc:307] Check failed: DefragmentAndAllocate(op_output_tensor_name, signed_output_size_bytes, &output_tensor_offset, &move_ops_added_by_defragmentation, cg_proto) Not enough primary memory!
    E0321 22:05:45.085134 7104 process_state.cc:679] RAW: Raising signal 6 with default behavior
    I0321 22:05:45.085140 7104 process_state.cc:1275] RAW: FailureSignalHandler() exiting
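Since the output above is long, a small script can help pull out the parts that matter: which inputs the preprocessor flags and what the fatal check says. The following is a minimal sketch that assumes the usual glog line layout (severity letter and date, time, thread id, `file:line]`, message). The `summarize` helper is hypothetical, and the sample lines are copied from the output above with each timestamp placed in front of its message.

```python
import re
from collections import Counter

# Assumed glog layout: "W0321 22:05:42.662239 7104 file.cc:186] message"
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)

# The preprocessor warning that names a tensor feeding a Conv operator.
TOO_LARGE = re.compile(
    r"^Input (?P<tensor>\S+) of operator (?P<op>\S+) "
    r"has values that may be too large"
)


def summarize(log_text):
    """Count lines per severity, list flagged inputs, collect fatal messages."""
    levels = Counter()
    flagged_inputs = []
    fatal = []
    for raw in log_text.splitlines():
        match = GLOG_LINE.match(raw.strip())
        if match is None:
            continue  # stack-trace lines and blank lines are skipped
        levels[match["level"]] += 1
        hit = TOO_LARGE.match(match["msg"])
        if hit:
            flagged_inputs.append(hit["tensor"])
        # The fatal line is printed twice in the output; keep it once.
        if match["level"] == "F" and match["msg"] not in fatal:
            fatal.append(match["msg"])
    return {
        "levels": dict(levels),
        "flagged_inputs": flagged_inputs,
        "fatal": fatal,
    }


SAMPLE = """\
W0321 22:05:42.662239 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv/project/BatchNorm/FusedBatchNorm of operator Conv has values that may be too large: graph missing ReLU?
W0321 22:05:42.662486 7104 model_preprocessing_utils.cc:186] Input MobilenetV2/expanded_conv_2/add of operator Conv has values that may be too large: graph missing ReLU?
F0321 22:05:45.083605 7104 allocate_memory_on_myriad.cc:307] Check failed: DefragmentAndAllocate(op_output_tensor_name, signed_output_size_bytes, &output_tensor_offset, &move_ops_added_by_defragmentation, cg_proto) Not enough primary memory!
    @ 0x68ae7f (unknown)
"""

if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["levels"])
    for tensor in summary["flagged_inputs"]:
        print("flagged:", tensor)
    for message in summary["fatal"]:
        print("fatal:", message)
```

On the full output this shows that every flagged input is either a project/BatchNorm/FusedBatchNorm output or a residual add, and that the only fatal message is the "Not enough primary memory!" check in allocate_memory_on_myriad.cc.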

I saw in issue 402 (https://github.com/google/aiyprojects-raspbian/issues/402#issuecomment-396874585) that MobileNetV2 models have been released for the AIY vision bonnet. Are there any instructions for how these were (re)trained, optimized, and compiled for the vision bonnet?

Many thanks, Chris

google/aiyprojects-raspbian, issue #594