Closed QING0304 closed 2 years ago
Hi, may I ask which versions of edgetpu_compiler and the runtime you used? Thanks!
@DLMasterCat Hi,
edgetpu_compiler version: 14.1.317412892
edgetpu API version on the TPU: 2.14.1
tflite-runtime version on the TPU: 2.1.0.post1
Any suggestions would be appreciated!
@QING0304 if the mAP dropped going from .pb -> .tflite, this looks like a TFLite conversion issue rather than a compiler issue, and the TensorFlow team would be able to give you a more appropriate answer. That said, SSD MobileNet is well proven, so I'm very surprised to see this issue. May I also know the full command you used to train the model (and any changes to num steps)?
@Namburger Hi, I used this script https://github.com/tensorflow/models/blob/master/research/object_detection/legacy/train.py to train the model, and set the --pipeline_config_path and --train_dir parameters to the corresponding paths.
The content of pipeline.config is as follows:
model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 3
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
train_config: {
  batch_size: 16
  data_augmentation_options {
    random_adjust_brightness {
    }
  }
  data_augmentation_options {
    random_pixel_value_scale {
    }
  }
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.01
          decay_steps: 4000
          decay_factor: 0.5
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "/ext/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 40000
  load_all_detection_checkpoint_vars: true
}
train_input_reader {
  label_map_path: "/ext/SSD.pbtxt"
  tf_record_input_reader {
    input_path: "/ext/xxx.tfrecords"
  }
}
#eval_config: {
#  num_examples: 550
#}
#
#eval_input_reader: {
#  tf_record_input_reader {
#    input_path: ""
#  }
#  label_map_path: ""
#  shuffle: false
#  num_readers: 1
#}
graph_rewriter {
  quantization {
    delay: 0
    weight_bits: 8
    activation_bits: 8
  }
}
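For reference, the learning-rate schedule implied by the exponential_decay_learning_rate block in this config can be worked out by hand. The sketch below is plain Python arithmetic, not the Object Detection API's own code: the function name `decayed_lr` is hypothetical, and the `staircase` switch is an assumption (check optimizer.proto in your checkout for the actual default).

```python
def decayed_lr(step, initial_lr=0.01, decay_steps=4000, decay_factor=0.5,
               staircase=True):
    """Exponential decay: initial_lr * decay_factor ** (step / decay_steps).

    Defaults mirror the values in the pipeline.config from this thread.
    With staircase=True (an assumption), the exponent is floored, so the
    rate drops in discrete steps every `decay_steps` training steps.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_factor ** exponent


if __name__ == "__main__":
    # Print the rate at a few points up to num_steps (40000 in the config).
    for step in (0, 4000, 20000, 40000):
        print(step, decayed_lr(step))
```

Under these assumptions the rate has halved ten times by step 40000, ending at 0.01 / 1024, roughly 9.8e-6.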
Oh, could you try model_main.py instead? legacy/train.py is deprecated.
python3 object_detection/model_main.py --model_dir <path> --pipeline_config_path <path>
Thanks for the suggestion! I will try it and share the result as soon as possible.
Also, I just realized something: were these flags necessary?
--default_ranges_min=0 \
--default_ranges_max=6 \
Usually they are only needed if you trained without the quantization rewriter. I wonder what happens if you take them off?
@Namburger Thanks for the reply again!
Following your suggestions, I used model_main.py to train the MobileNet V2 model with the same pipeline.config, and removed --default_ranges_min and --default_ranges_max when converting .pb to .tflite. The result is the same as before: the mAP still decreased by about 20% on Android and on the TPU.
In addition, I used the random_adjust_brightness and random_pixel_value_scale data augmentations during training because of the characteristics of my own dataset. Is it possible that they cause a problem on the TPU?
@QING0304 Well, there are 2 steps in getting your model from a TensorFlow graph file to an Edge TPU compatible model:
inference_graph.pb -> (1) tflite_converter -> model.tflite (cpu) -> (2) edgetpu_compiler -> model_edgetpu.tflite (edgetpu)
The way I understand the problem, the model's mAP decreased by about 20% after step 1, correct? In that case you should see the issue in model.tflite (cpu) too. Could you try evaluating that model?
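One way to act on this is to evaluate the same validation set at each stage and see which step loses the accuracy. A minimal sketch, assuming you already have one mAP number per stage: the helper `largest_drop` and the stage labels are hypothetical, and the example values are the ones reported in this thread (72% for the .pb, about 51% for the .tflite on Android and for the compiled model on the TPU).

```python
def largest_drop(stage_map):
    """Return (from_stage, to_stage, drop) for the biggest mAP loss.

    stage_map: list of (stage_name, mAP) pairs in pipeline order.
    """
    if len(stage_map) < 2:
        raise ValueError("need at least two stages to compare")
    # Pair each stage with the one that follows it and record the loss.
    drops = [
        (prev_name, name, prev_map - cur_map)
        for (prev_name, prev_map), (name, cur_map)
        in zip(stage_map, stage_map[1:])
    ]
    return max(drops, key=lambda d: d[2])


# mAP values as reported in this thread (the .tflite figure was measured on Android).
stages = [
    ("inference_graph.pb", 0.72),
    ("model.tflite (cpu)", 0.51),
    ("model_edgetpu.tflite (edgetpu)", 0.51),
]
src, dst, drop = largest_drop(stages)
print(f"largest drop: {src} -> {dst} ({drop:.2f} mAP)")
```

With those numbers the whole loss sits in step 1, which points at the conversion rather than at edgetpu_compiler.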
I am experiencing a similar issue. I trained an SSD MobileNet V2 model using research/object_detection/model_main.py. The tflite model works fine on CPU. However, the Edge TPU version produces much worse results. I used the following command to convert to tflite:
tflite_convert --graph_def_file=tflite_graph.pb --output_file=detect.tflite \
  --input_shapes=1,300,300,3 --input_arrays=normalized_input_image_tensor \
  --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
  --inference_type=FLOAT --allow_custom_ops
And the following script to produce the quantized model for the Edge TPU:
# Imports are assumed; the original post did not show them.
import tensorflow as tf
from tensorflow import lite
from tensorflow.lite.python import lite_constants

# `tfrecord` and `flags` are the poster's own helpers and are not shown in the thread.
converter = lite.TFLiteConverter.from_frozen_graph(
    "tflite_graph.pb",
    ["normalized_input_image_tensor"],
    ["TFLite_Detection_PostProcess",
     "TFLite_Detection_PostProcess:1",
     "TFLite_Detection_PostProcess:2",
     "TFLite_Detection_PostProcess:3"],
    {"normalized_input_image_tensor": [1, 300, 300, 3]})
converter.allow_custom_ops = True
converter.optimizations = [lite.Optimize.DEFAULT]
converter.output_format = lite_constants.TFLITE

def _representative_data_gen():
    # Calibration images for post-training quantization.
    for example in tfrecord.parse(['train.tfrecord-00000-of-00100']):
        encoded = example["image/encoded"].numpy()
        image = tf.io.decode_jpeg(encoded, channels=3)
        # tf.image.resize expects the size as [height, width].
        image = tf.image.resize(image, [flags.height, flags.width])
        image = tf.cast(image, tf.float32)
        image = image / 255.
        image = tf.expand_dims(image, 0)
        yield [image]

converter.representative_dataset = _representative_data_gen
tflite_model_quant = converter.convert()
OK, I found the issue.
image = image / 255.
should be
image = image / 255. - 0.5
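Why the offset matters: with post-training quantization, the representative dataset is what the converter uses to calibrate value ranges, so calibration images scaled differently from the model's real inputs give ranges that do not fit. The sketch below is a simplified, pure-Python model of 8-bit affine quantization that illustrates one way such a mismatch can hurt. It is not the TFLite converter's implementation, and `affine_params`, `quantize`, and `dequantize` are hypothetical helpers.

```python
def affine_params(rmin, rmax, qmin=0, qmax=255):
    """Scale and zero point mapping the real range [rmin, rmax] onto [qmin, qmax]."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to an integer code, clipping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to the real value it represents."""
    return (q - zero_point) * scale


# Range calibrated on images scaled to [0, 1] (image / 255.).
scale, zp = affine_params(0.0, 1.0)

# Inputs that actually lie in [-0.5, 0.5] (image / 255. - 0.5).
for x in (-0.5, -0.25, 0.0, 0.25, 0.5):
    q = quantize(x, scale, zp)
    print(x, "->", dequantize(q, scale, zp))
```

In this toy setup every negative input is clipped to code 0 and comes back as 0.0, so half of the input range is lost before the first layer runs.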
@QING0304 are you still having any issues here?
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
Hello,
I trained an SSD MobileNet V2 model on my own dataset for object detection. The mAP of the .pb model, evaluated on a PC, is 72%. However, when I evaluated the .tflite model on Android and the compiled .tflite model on the TPU, the mAP decreased to about 51% in both cases.
First, I followed https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_training_and_evaluation.md to do quantization-aware training on the SSD MobileNet V2 model, with pipeline config https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v2_quantized_300x300_coco.config and pre-trained model http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz. The quant_delay parameter was set to 0, inspired by https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/quantize/create_training_graph. The whole training process ran on an RTX 2080Ti GPU with TensorFlow 1.15.
I converted ckpt to tflite by using the following method:
Finally, I compiled the .tflite model into a model that is compatible with the Edge TPU, following https://coral.ai/docs/edgetpu/compiler/#usage. The evaluation code was adapted from https://github.com/google-coral/edgetpu/blob/master/examples/object_detection.py
I am really confused about why the mAP decreased by about 20% on Android and on the TPU. Is there any problem in the process I described above?