Closed · lalith-mcw closed this issue 1 year ago
```python
bf16_config = {"ENFORCE_BF16": "YES"}
network = iecore.read_network(model=".\\unet-camvid-onnx-0001\\intel\\unet-camvid-onnx-0001\\FP16\\unet-camvid-onnx-0001.xml", weights=".\\unet-camvid-onnx-0001\\intel\\unet-camvid-onnx-0001\\FP16\\unet-camvid-onnx-0001.bin")
iecore.load_network(network=network, device_name="CPU", config=bf16_config)
```
Tried the same commands with a model available online, and that one is read and loaded without any errors being thrown. No node in my model has a shape of -1, so I am still unsure why the error is raised.
Hi, @lalith-mcw. Is it possible to share the model that cannot be loaded? Also, could you please clarify which OpenVINO version you are using? I see the exception is thrown at the read_network stage; however, ENFORCE_BF16 shouldn't affect this stage at all, so it is hard to predict how the two could be connected.
@dmitry-gorokhov That was a mistake on my end: I was reading the FP16 weights file for an FP32 model. Still, I do have issues when running the model without any precision-hint settings on the iGPU.
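A quick guard against exactly this kind of xml/bin mix-up can run before read_network; the helper below is a hypothetical utility, not part of the OpenVINO API:

```python
from pathlib import Path

def expected_weights(xml_path: str) -> Path:
    """Derive the .bin weights path that pairs with an IR .xml topology
    (hypothetical helper, not an OpenVINO API)."""
    return Path(xml_path).with_suffix(".bin")

def check_ir_pair(xml_path: str) -> Path:
    """Verify both halves of the IR pair exist; return the weights path."""
    xml, weights = Path(xml_path), expected_weights(xml_path)
    for p in (xml, weights):
        if not p.exists():
            raise FileNotFoundError(f"missing IR file: {p}")
    return weights
```

If I recall correctly, read_network also defaults the weights path to the .xml path with a .bin suffix when the weights argument is omitted, which avoids this mismatch entirely.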
@lalith-mcw Could you please provide more details on the issues you are facing with the iGPU launch?
```
[ERR] 2023-02-07T08:06:35z core\src\util.cpp 87 malloc failed to allocate memory of size 1073741889
Traceback (most recent call last):
File "C:\Users\amduser2\Documents\Lalith\openvino_fp16\fp16\demo.py", line 99, in <module>
main(args)
File "C:\Users\amduser2\Documents\Lalith\openvino_fp16\fp16\demo.py", line 34, in main
engine = StableDiffusionEngine(
File "C:\Users\amduser2\Documents\Lalith\openvino_fp16\fp16\stable_diffusion_engine.py", line 42, in __init__
self.unet = self.core.compile_model(self._unet, device)
File "C:\Users\amduser2\Documents\Lalith\check_fp16\lib\site-packages\openvino\runtime\ie_api.py", line 266, in compile_model
super().compile_model(model, device_name, {} if config is None else config)
RuntimeError: bad allocation
```
My system has 16GB of memory. This concerns FP32 models on the iGPU (i7-1165G7): unet_fp32_static.zip
Link for the binary file: https://drive.google.com/file/d/1J0aX9GlonZDy4PS2i24QIREEwvgZGM9J/view?usp=share_link
Whereas I was able to run FP16-compressed models on the iGPU without issues; the problem occurs only with FP32 models.
@lalith-mcw If the FP16 model works, then most likely you don't have enough memory to run the FP32 model. As I can see, the weights size is ~3.6GB for FP32, and the intermediate tensors are also quite large (~2.5-3GB in total), so the GPU plugin requires ~6GB of memory to execute this model. So if your Stable Diffusion demo retains the original ov::Model after compilation (+3.6GB), loads other models, or uses streams/batching, then memory consumption may exceed 16GB, and that may cause the bad-alloc exception.
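The arithmetic above can be sanity-checked in a few lines; the ~900M parameter count below is an assumption back-derived from the quoted ~3.6GB FP32 weights size, not a figure read from the model itself:

```python
# Rough memory estimate for the UNet. The parameter count is an ASSUMED
# figure derived from the ~3.6 GB FP32 weights size quoted above.
PARAMS = 900_000_000          # assumed number of weights
BYTES_FP32, BYTES_FP16 = 4, 2

weights_fp32_gb = PARAMS * BYTES_FP32 / 1024**3
weights_fp16_gb = PARAMS * BYTES_FP16 / 1024**3
activations_gb = 2.5          # intermediate tensors, per the estimate above

print(f"FP32: ~{weights_fp32_gb + activations_gb:.1f} GB to execute")
print(f"FP16: ~{weights_fp16_gb + activations_gb:.1f} GB to execute")
```

Halving the weight precision roughly halves the weights footprint, which is consistent with the FP16 model fitting where the FP32 one does not.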
Could you check whether the single UNet model can be successfully loaded to the GPU plugin using benchmark_app on your machine?
Also, you can try to query memory statistics from the GPU plugin using the ov::intel_gpu::memory_statistics property to check how much memory is used at different pipeline stages (e.g. after compilation of each model).
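As a sketch of what handling those statistics could look like: the property returns a mapping of allocation type to bytes, which a small helper can summarize. The key names and values below are illustrative stand-ins, not a guaranteed schema; on a real machine the dict would come from the get_property call suggested above:

```python
def summarize_gpu_memory(stats: dict) -> str:
    """Render a GPU memory-statistics dict (allocation type -> bytes)
    as human-readable lines plus a total."""
    lines = [f"  {kind}: {size / 1024**2:.1f} MiB" for kind, size in stats.items()]
    lines.append(f"  total: {sum(stats.values()) / 1024**2:.1f} MiB")
    return "\n".join(lines)

# On a machine with the GPU plugin the dict would come from something like:
#   stats = core.get_property("GPU", "GPU_MEMORY_STATISTICS")
# Stand-in values for illustration:
print(summarize_gpu_memory({"usm_device": 3_600_000_000, "usm_host": 250_000_000}))
```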
@vladimir-paramuzov
Failed to set property to 'GPU' which is not found in the target devices list 'CPU'!
But with the query_device script it's failing, due to the undefined precision hint I suppose (https://github.com/openvinotoolkit/openvino/issues/15552):
```
[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES:
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 8
[ INFO ] FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[ INFO ] OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ] CACHING_PROPERTIES: {}
[ INFO ] CACHE_DIR:
[ INFO ] NUM_STREAMS: 1
[ INFO ] AFFINITY: Affinity.NONE
[ INFO ] INFERENCE_NUM_THREADS: 0
[ INFO ] PERF_COUNT: False
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]
[ INFO ] GPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES: 0
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 2
[ INFO ] OPTIMAL_BATCH_SIZE: 1
[ INFO ] MAX_BATCH_SIZE: 1
[ INFO ] CACHING_PROPERTIES: {'GPU_UARCH_VERSION': 'RO', 'GPU_EXECUTION_UNITS_COUNT': 'RO', 'GPU_DRIVER_VERSION': 'RO', 'GPU_DEVICE_ID': 'RO'}
[ INFO ] DEVICE_ARCHITECTURE: GPU: v12.0.0
[ INFO ] FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics (iGPU)
[ INFO ] DEVICE_UUID: UNSUPPORTED TYPE
[ INFO ] DEVICE_TYPE: Type.INTEGRATED
[ INFO ] DEVICE_GOPS: UNSUPPORTED TYPE
[ INFO ] OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
[ INFO ] GPU_DEVICE_TOTAL_MEM_SIZE: UNSUPPORTED TYPE
[ INFO ] GPU_UARCH_VERSION: 12.0.0
[ INFO ] GPU_EXECUTION_UNITS_COUNT: 96
[ INFO ] GPU_MEMORY_STATISTICS: UNSUPPORTED TYPE
[ INFO ] PERF_COUNT: False
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ] CACHE_DIR:
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 1
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'undefined'>
[ INFO ] DEVICE_ID: 0
```
@lalith-mcw you need to call get_property(), not set_property(). Something like
```python
stat = core.get_property('GPU', 'GPU_MEMORY_STATISTICS')
```
Closing this; I hope the previous responses were sufficient to help you proceed. Feel free to reopen if you have any questions related to this topic.
Trying to run a BF16 simulation on an Intel i7-1035G, which doesn't have a native avx512_bf16 implementation. Reference: https://docs.openvino.ai/2021.4/openvino_docs_IE_DG_Bfloat16Inference.html?sw_type=switcher-python
But my model is failing with the following error:
When using dynamic shapes with default inference (FP32 inference using FP16 models) it doesn't fail, but with the same dynamic shapes under BF16 inference with FP16 models the model fails. Earlier I tried dynamic shapes in FP32 inference with an FP32 model, and that works fine without issues.
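As background on what BF16 gives up relative to FP32: bfloat16 keeps the FP32 exponent range but only 7 mantissa bits, so values round to about 2-3 significant decimal digits. A short sketch makes the precision loss concrete (real hardware typically rounds to nearest-even; plain truncation is used here for simplicity):

```python
import struct

def to_bf16(x: float) -> float:
    """Round-trip a float through bfloat16 by truncation: pack as an
    IEEE-754 binary32 word and zero the low 16 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bf16(1.0))         # exactly representable, survives unchanged
print(to_bf16(3.14159265))  # only ~7 mantissa bits survive
```

This precision loss is expected behavior of BF16 simulation and is separate from the dynamic-shape failure reported above.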