Open EmbeddedPaul166 opened 4 months ago
Any updates on this?
I have the same issue with GroupNormalization and can't continue my research without a fix. I hope you find a solution.
Ref. 149211
@EmbeddedPaul166 I ran a quick test with the provided steps on an MTL (Meteor Lake) machine with an NPU (Intel Core Ultra 7 155H) and did not observe the issue. Please try the latest OpenVINO version, 2024.3, together with the latest NPU driver and check whether the issue is fixed on your end. The code snippet below shows the model conversion to OpenVINO IR. Hope this helps.
@dziulek please also try the latest OpenVINO version and NPU driver on your end. If the issue persists, please share a sample reproducer (model definition, conversion steps, application code).
# tf model definition
import tensorflow as tf
inp = tf.keras.Input((None, None, 1), dtype=tf.float32)
y = tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
y = tf.keras.layers.GroupNormalization(4)(y)
model = tf.keras.Model(inputs=inp, outputs=y)
tf.keras.models.save_model(model, 'test_model.keras')
# model conversion to OpenVINO IR
import tensorflow as tf
from openvino import convert_model, save_model
model_1_path = "test_model.keras"
model = tf.keras.models.load_model(model_1_path)
model.export('test_model')
ov_model = convert_model('test_model', input=("input_layer", [1,256,256,1]))
save_model(ov_model, 'test_model.xml')
$ benchmark_app -m test_model.xml -d NPU -t 5
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-16041-1e3b88e4e3f-releases/2024/3
[ INFO ]
[ INFO ] Device info:
[ INFO ] NPU
[ INFO ] Build ................................. 2024.3.0-16041-1e3b88e4e3f-releases/2024/3
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(NPU) performance hint will be set to PerformanceMode.THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 2.54 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] input_layer (node: input_layer) : f32 / [...] / [1,256,256,1]
[ INFO ] Model outputs:
[ INFO ] output_0 (node: functional_1/group_normalization_1/Reshape_3) : f32 / [...] / [1,256,256,32]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] input_layer (node: input_layer) : f32 / [N,H,W,C] / [1,256,256,1]
[ INFO ] Model outputs:
[ INFO ] output_0 (node: functional_1/group_normalization_1/Reshape_3) : f32 / [...] / [1,256,256,32]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 468.60 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] DEVICE_ID:
[ INFO ] ENABLE_CPU_PINNING: False
[ INFO ] EXECUTION_DEVICES: NPU
[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float16'>
[ INFO ] LOADED_FROM_CACHE: False
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] NETWORK_NAME: TensorFlow_Frontend_IR
[ INFO ] NPU_COMPILATION_MODE_PARAMS:
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] PERFORMANCE_HINT: PerformanceMode.THROUGHPUT
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 1
[ INFO ] PERF_COUNT: False
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'input_layer'!. This input will be filled with random values!
[ INFO ] Fill input 'input_layer' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 5000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 150.87 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:NPU
[ INFO ] Count: 44 iterations
[ INFO ] Duration: 5547.39 ms
[ INFO ] Latency:
[ INFO ] Median: 501.86 ms
[ INFO ] Average: 486.99 ms
[ INFO ] Min: 145.57 ms
[ INFO ] Max: 523.53 ms
[ INFO ] Throughput: 7.93 FPS
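As a quick sanity check on the report above, the throughput figure can be recomputed from the iteration count and the total duration. The sketch below assumes that, for batch size 1, benchmark_app reports throughput as iterations divided by duration in seconds; the helper names `parse_report` and `throughput_fps` are made up for this example and are not part of OpenVINO.

```python
import re

# Excerpt of the statistics report printed by benchmark_app above.
REPORT = """\
[ INFO ] Count: 44 iterations
[ INFO ] Duration: 5547.39 ms
[ INFO ] Throughput: 7.93 FPS
"""


def parse_report(text):
    """Extract iteration count, duration in ms, and reported FPS from a report."""
    count = int(re.search(r"Count:\s+(\d+)\s+iterations", text).group(1))
    duration_ms = float(re.search(r"Duration:\s+([\d.]+)\s+ms", text).group(1))
    fps = float(re.search(r"Throughput:\s+([\d.]+)\s+FPS", text).group(1))
    return count, duration_ms, fps


def throughput_fps(count, duration_ms):
    """Assumed formula for batch size 1: iterations per second of wall time."""
    return count / (duration_ms / 1000.0)


count, duration_ms, reported_fps = parse_report(REPORT)
computed_fps = throughput_fps(count, duration_ms)
print(f"computed {computed_fps:.2f} FPS, reported {reported_fps:.2f} FPS")
```

The recomputed value agrees with the reported 7.93 FPS to two decimal places, so the numbers in the log are internally consistent.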
It works now, thank you :D
OpenVINO Version
2024.1.0
Operating System
Windows System
Device used for inference
NPU
Framework
Keras (TensorFlow 2)
Model used
Custom
Issue description
Consider the following workflow:
Problem: Adding a GroupNormalization layer makes benchmark_app crash on the NPU.
Tests were performed on a laptop with an Intel Core Ultra 7 155H CPU. The TensorFlow version was 2.14.0.
Step-by-step reproduction
Step 1: Model creation in TensorFlow
import tensorflow as tf
inp = tf.keras.Input((None, None, 1), dtype=tf.float32)
y = tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
y = tf.keras.layers.GroupNormalization(4)(y)
model = tf.keras.Model(inputs=inp, outputs=y)
tf.keras.models.save_model(model, 'test_model')
Step 2: Model conversion to OpenVINO format
mo.exe --saved_model_dir .\test_model --input input_1 --input_shape [1,256,256,1]
Step 3: Performing benchmark on NPU
benchmark_app.exe -m saved_model.xml -hint ctput -data_shape "[1, 256, 256, 1]" -inference_only -report_type detailed_counters -d NPU
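For context on the layer involved, here is a minimal pure-Python sketch of what GroupNormalization(4) computes on the 32 Conv2D channels from Step 1: the channels are split into 4 groups of 8, and each group is normalized with the mean and variance taken over all spatial positions and the channels in that group. This is an illustration under stated assumptions, not the Keras or OpenVINO implementation. It handles a single sample in channels-last layout, omits the learned scale and offset, and assumes epsilon is 1e-3; `group_norm` is a hypothetical helper name.

```python
import math


def group_norm(pixels, groups, epsilon=1e-3):
    """Normalize one channels-last sample.

    pixels: list of H*W spatial positions, each a list of C channel values.
    groups: number of channel groups; C must be divisible by it.
    epsilon: assumed stabilizer added to the variance.
    """
    channels = len(pixels[0])
    if channels % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    size = channels // groups
    out = [[0.0] * channels for _ in pixels]
    for g in range(groups):
        lo, hi = g * size, (g + 1) * size
        # Statistics are shared by every position and channel in the group.
        values = [p[c] for p in pixels for c in range(lo, hi)]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + epsilon)
        for i, p in enumerate(pixels):
            for c in range(lo, hi):
                out[i][c] = (p[c] - mean) * inv_std
    return out


# Toy input: 4 spatial positions with 32 channels each, matching Conv2D(32, ...).
pixels = [[float(i * 32 + c) for c in range(32)] for i in range(4)]
out = group_norm(pixels, groups=4)
print(len(out), len(out[0]))
```

After normalization each group of 8 channels has a mean of approximately zero and a variance just under one, because epsilon is added to the variance before the square root.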
Relevant log output