Open hygehyge opened 12 hours ago
OpenVINO will use FP16 by default on GPU or BF16 on CPU if the device supports them, otherwise FP32.
Also, is there any point in adding another option for FP32? You would double the VRAM usage and the generation time for no reason: FP16 and FP32 outputs are basically the same, and almost every model on CivitAI or Hugging Face is an FP16 model. Using FP32 would do nothing but waste compute and memory.
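For scale, the memory part of this claim comes down to bytes per weight: FP32 stores 4 bytes per element, while FP16 and BF16 store 2. The sketch below only does that arithmetic. The parameter count is a hypothetical round number, not a measurement of any particular model, and activations and runtime overhead are ignored.

```python
# Bytes per element for the precisions discussed in this thread.
BYTES_PER_ELEMENT = {"f32": 4, "f16": 2, "bf16": 2}


def weight_bytes(num_params: int, precision: str) -> int:
    """Return the memory needed to store `num_params` weights, in bytes."""
    return num_params * BYTES_PER_ELEMENT[precision]


# Hypothetical model size, chosen only to keep the arithmetic readable.
NUM_PARAMS = 1_000_000_000

for precision in ("f16", "bf16", "f32"):
    gib = weight_bytes(NUM_PARAMS, precision) / 2**30
    print(f"{precision}: {gib:.2f} GiB of weights")
```

This covers weight storage only; whether generation time also doubles depends on the device and is not something this arithmetic can show.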
@Disty0
> OpenVINO will use FP16 by default on GPU or BF16 on CPU if the device supports them, otherwise FP32.
Yes, that's right.
Because of this, if a model was trained in FP32 and is run with OpenVINO (which is my case), the inference result differs from what is expected.
Being able to control the precision hint on devices that support FP32 would fix this issue.
For example, my iGPU supports FP32, but its INFERENCE_PRECISION_HINT is FP16.
[ INFO ] GPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES: 0
[ INFO ] DEVICE_ARCHITECTURE: GPU: vendor=0x8086 arch=v12.3.0
[ INFO ] FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics (iGPU)
[ INFO ] OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8, EXPORT_IMPORT
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float16'>
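The mismatch in that log (FP32 is listed as a capability, yet the hint is float16) follows from the default rule quoted earlier. The sketch below models that rule in plain Python so it can be checked against the capabilities from the log. `default_precision` is a hypothetical helper that restates the rule as described in this thread, not OpenVINO's actual selection code. `query_device` assumes an OpenVINO release that provides `import openvino as ov` and reads the same two properties by their string names.

```python
from typing import Iterable


def default_precision(device: str, capabilities: Iterable[str]) -> str:
    """Model the default described in this thread (an assumption, not OpenVINO code)."""
    caps = set(capabilities)
    if device.startswith("GPU") and "FP16" in caps:
        return "f16"
    if device.startswith("CPU") and "BF16" in caps:
        return "bf16"
    return "f32"


def query_device(device: str = "GPU"):
    """Read the two properties shown in the log; requires the openvino package."""
    import openvino as ov  # imported lazily so the rule above runs without it

    core = ov.Core()
    caps = core.get_property(device, "OPTIMIZATION_CAPABILITIES")
    hint = core.get_property(device, "INFERENCE_PRECISION_HINT")
    return caps, hint


# Capabilities copied from the log above.
IRIS_XE_CAPS = ["FP32", "BIN", "FP16", "INT8", "EXPORT_IMPORT"]
print(default_precision("GPU", IRIS_XE_CAPS))
```

So supporting FP32 is not enough on its own: as long as the device also reports FP16, the default lands on FP16 unless something overrides it.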
When I set execution_mode to ACCURACY as shown below, the results look as expected.
(It took about 2 s/it, versus about 1 s/it in FP16 mode, as you mentioned.)
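# in modules/intel/openvino/__init__.py
# Assumes `core` is the module's existing OpenVINO Core instance.
import openvino.properties.hint as hints

# Ask the GPU plugin to favor accuracy over speed.
core.set_property(
    "GPU",
    {hints.execution_mode: hints.ExecutionMode.ACCURACY},
)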
So I think adding a precision control setting would be useful.
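A rough sketch of what such a setting could look like, assuming a hypothetical three-way option (`"default"`, `"accuracy"`, `"fp32"`) that is translated into OpenVINO properties. The option names and both helper functions are made up for illustration and are not part of SD.Next. `INFERENCE_PRECISION_HINT` is the property name shown in the device log; `EXECUTION_MODE_HINT` is assumed to be the string name behind `hints.execution_mode` and should be checked against the installed OpenVINO version.

```python
def precision_properties(setting: str) -> dict:
    """Translate a hypothetical user-facing setting into OpenVINO property strings."""
    if setting == "default":
        # Leave the plugin's own default (FP16 on GPU, as discussed) untouched.
        return {}
    if setting == "accuracy":
        # Same effect as the snippet above, using the string form of the hint.
        return {"EXECUTION_MODE_HINT": "ACCURACY"}
    if setting == "fp32":
        # Pin the inference precision directly instead of using the mode hint.
        return {"INFERENCE_PRECISION_HINT": "f32"}
    raise ValueError(f"unknown precision setting: {setting!r}")


def apply_precision(device: str, setting: str):
    """Apply the setting to a device; requires the openvino package."""
    import openvino as ov  # lazy import: the mapping above is usable without it

    core = ov.Core()
    props = precision_properties(setting)
    if props:
        core.set_property(device, props)
    return core


print(precision_properties("accuracy"))
```

Keeping `"default"` as an empty mapping means existing users would see no change in speed or memory unless they opt in.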
Feature description
As described here, OpenVINO has precision control hints.
These let users choose which matters more: ACCURACY or PERFORMANCE.
In my environment (OpenVINO + Intel Iris Xe iGPU), these flags affect the inference result significantly.
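To give a feel for where precision-related differences can come from, the sketch below rounds a few values to IEEE 754 half precision using only the standard library and reports the largest deviation. It is an illustration of FP16 rounding on individual numbers, with made-up sample values. It does not reproduce how the GPU plugin executes a model and says nothing about the size of the effect in a real pipeline.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def max_abs_diff(a, b) -> float:
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up sample values standing in for higher-precision outputs.
reference = [0.1, 1.0, 3.14159, 1000.123]
rounded = [to_fp16(v) for v in reference]

print(rounded)
print(max_abs_diff(reference, rounded))
```

The same `max_abs_diff` helper could be applied to two real outputs (for example, one run per execution mode) to quantify how far apart they are.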
Version Platform Description
SD.Next:hash=5c684cb0 branch=master