person-reidentification-model-0103/107/200 (conversion)

dinara92 commented 4 years ago

Hi,

I have a question about new person-reidentification-model-0103/107/200 (based on OSNet).

I tried running models with the latest inference engine, however I get "cannot parse future versions" error. I understand it is not possible to directly use these models with the latest engine. Therefore, I have following questions: 1) When is the approximate date of the release of the new engine corresponding to the latest open_model_zoo develop branch?

2) I tried converting the pre-trained PyTorch models available here, according to conversion guidelines. When converting to .onnx format, I get following warning:

UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch. This operator might cause results to not match the expected results by PyTorch. ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).

According to Openvino release notes, the latest opset provided is opset10 (for onnx). I used the latest version, release 3.1, but there is no information about opset 11 support. Would that mean I would still need to wait for the latest inference engine for model conversion, as well?

If I continue with conversion, ignoring the above-mentioned warning, I am able to get .xml and .bin IR model files. However, the accuracy of that converted model (person-reidentification-retail-0200) is very poor.

vladimir-dudnik commented 4 years ago

an approximate date of OpenVINO 2019R4 release (the actual version name might change though) is somewhere at the end of January. I'll let @snosov1 to comment on accuracy aspect for person-reidentification-retail-0200 model

snosov1 commented 4 years ago

I think your best bet at this point is to convert the model from the original format with OpenVINO R3 (latest released version to date). It's your item 2. Supposedly, it should work fine with it.

@dinara92 Can you, please, share your testing protocol from which you derive that the accuracy is poor?

@DmitriySidnev do you have any ideas/comments ?

DmitriySidnev commented 4 years ago

Hi, all! @dinara92, do not worry about this warning. I use pytorch 1.3.1 and see a similar warning about Resize method and opset version. It is because in conversion we use early version of opset while current version of pytorch supports the latest and implementation of the method may be different (not necessary). We can use the latest opset but in this case model optimizer can not convert ONNX model to IR format because of the Resize method and using of 9th opset version resolves it. So, I use OpenVINO 2019 R3 and checked the whole process from loading shared weights for 0200 model in pytorch format to IR model, tested the result model on my dataset. Metrics are the same for pytorch and IR models. Could you please share your setup and versions of pytorch, OpenVINO? How do you understand that the quality is poor? What script do you run?

dinara92 commented 4 years ago

Thanks for the information, I appreciate fast response. I didn't test on the dataset, just mere inferencing different versions of reid models (from 2019R3 and R4) and comparing video outputs using multi_camera_multi_person_tracking.py demo. I will attach the output video files using each reid model (fp16/32) for your reference.

My setup is as follows: Openvino - the latest 2019R3, Pytorch - 1.4.0+cpu

Command for running demo: python3.5 multi_camera_multi_person_tracking.py -i 0025_0150.mp4 0030-00155.mp4 -m openvino/ir_models/person-detection-retail-0013/FP16/person-detection-retail-0013.xml --m_reid openvino/ir_models/converted_model/0200/FP16/person-reidentification-retail-0200.xml --config config.py -d GPU

Input files 0025_0150.mp4 and 0030-00155.mp4 are same samples, with 5 sec difference.

Here is the full warning log when trying to convert to .onnx model (acc. to this)

python3.5 convert_to_onnx.py --config config/person-reidentification-retail-0200.yaml --output-name ie_models/person-reidentification-retail-0200 --verbose

** The following layers are discarded due to unmatched keys or layer size: ['classifier.weight'] Building train transforms ...

random_grid

random_figures

random_padding

resize to 256x128

random flip

color jitter

random_gray_scale

random_rotate

to torch tensor of range [0, 1]

normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

random erase Building test transforms ...

resize to 256x128

to torch tensor of range [0, 1]

normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) .../openvino/openvino_training_extensions/pytorch_toolkit/person_reidentification/models/modules/fpn.py:70: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if lat.shape != top_blob.shape: .../openvino/openvino_training_extensions/pytorch_toolkit/person_reidentification/models/osnet_fpn.py:178: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! kernel_size = int(feature_pyramid[i].shape[2] // target_shape[0]) /usr/local/lib/python3.5/dist-packages/torch/onnx/symbolic_helper.py:246: UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch. ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode). We recommend using opset 11 and above for models using this operator. "" + str(_export_onnx_opset_version) + ". "

As mentioned above, if I ignore this log and proceed to convert to IR model, converted FP32 model's (person-reidentification-retail-0200) performance is similar to person-reidentification-retail-0079. However, converted FP16 model's (person-reidentification-retail-0200) performance is worse. Now that I checked Pytorch version, may it be the case that it affects the quality of converted FP16 model?

Please refer to attachment link and let me know if you need more clarification.

DmitriySidnev commented 4 years ago

@dinara92, I have watched the videos. As I said, the warning does not affect the result. I am not sure about GPU mode and FP16 data type because I did not try it but videos with 0200 model in CPU mode looks pretty good and I do not see any problems. Similar quality of models 0200 and 0079 can be because of simple videos (people are quite unique and 0079 is enough in this case) and the demo uses some heuristics for matching tracks that can be the reason of unexpected behavior (for example, when trajectories of people are crossed). Performance with FP16 data type depends on a used CPU (or GPU) and I really do not know why it is worse.

IRDonch commented 4 years ago

The new OpenVINO toolkit is out, so you can use the preconverted models now. Do you still observe the same issue with them?

dinara92 commented 4 years ago

@IRDonch I tested with the new Inference Engine (2020.1) , new reid models inference works fine :) After more code study, I understand now that the demo accuracy also partially relates to matching tracks heuristics. Thank you!

openvinotoolkit / open_model_zoo

person-reidentification-model-0103/107/200 (conversion) #754