Let's support new CWQ (channel-wise quantization) algorithms for the following Ops.
1. PRelu
Alpha is quantized per-channel. Since Alpha has a single value per channel, its fp value is exactly restored from the quantized value.
uint8

| Elem        | Input      | Alpha       | Output     |
|-------------|------------|-------------|------------|
| Dtype       | uint8      | uint8       | uint8      |
| Granularity | per-tensor | per-channel | per-tensor |
int16

| Elem        | Input      | Alpha       | Output     |
|-------------|------------|-------------|------------|
| Dtype       | int16      | int16       | int16      |
| Granularity | per-tensor | per-channel | per-tensor |
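The exact-restoration property for Alpha can be sketched in Python. This is a hypothetical illustration, not the project's actual quantizer: it assumes a symmetric int16 scheme with per-channel scale `|alpha| / 32767`, and all function names are made up for the sketch.

```python
import numpy as np

def quantize_alpha_per_channel(alpha):
    # Hypothetical sketch of symmetric int16 per-channel quantization.
    # Each channel holds a single alpha value, so the channel's scale
    # is derived from that one value alone.
    scale = np.abs(alpha).astype(np.float64) / 32767.0
    scale[scale == 0.0] = 1.0  # degenerate channel: alpha == 0
    q = np.round(alpha / scale).astype(np.int16)
    return q, scale

def dequantize(q, scale):
    return (q.astype(np.float64) * scale).astype(np.float32)

alpha = np.array([0.25, -0.1, 1.5, 0.0], dtype=np.float32)
q, scale = quantize_alpha_per_channel(alpha)
restored = dequantize(q, scale)
# With one value per channel, each alpha quantizes to +/-32767 (or 0),
# so the original fp32 value is restored exactly.
assert np.array_equal(restored, alpha)
```

Because the channel range is defined by the single value itself, no information is lost in rounding, which is why per-channel granularity is lossless here.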
2. Instance Normalization
gamma and beta are quantized per-channel. Epsilon is saved as-is (fp32). Since gamma (and beta) has a single value per channel, its fp value is exactly restored from the quantized value.
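A similar sketch for the uint8 case, showing gamma and beta quantized per-channel while epsilon passes through as fp32. Again the names are hypothetical, and the sketch assumes an asymmetric uint8 scheme whose range is nudged to include zero, which is a common convention rather than the project's confirmed one.

```python
import numpy as np

def quantize_param_u8(w):
    # Hypothetical asymmetric uint8 quantization of one per-channel value.
    # The range [rmin, rmax] is nudged to include 0 (sketch assumption).
    rmin = min(0.0, float(w))
    rmax = max(0.0, float(w))
    if rmax == rmin:  # w == 0: degenerate range
        return np.uint8(0), 1.0, 0
    scale = (rmax - rmin) / 255.0
    zero_point = int(round(-rmin / scale))
    q = np.uint8(round(float(w) / scale) + zero_point)
    return q, scale, zero_point

def quantize_instance_norm(gamma, beta, epsilon):
    # gamma and beta get per-channel quantization params;
    # epsilon is left untouched as fp32.
    g = [quantize_param_u8(v) for v in gamma]
    b = [quantize_param_u8(v) for v in beta]
    return g, b, np.float32(epsilon)

def dequantize_u8(q, scale, zero_point):
    return np.float32((int(q) - zero_point) * scale)

gamma = np.array([1.0, 0.5, -2.0], dtype=np.float32)
beta = np.array([0.0, 0.1, -0.3], dtype=np.float32)
qg, qb, eps = quantize_instance_norm(gamma, beta, 1e-5)
restored_gamma = np.array([dequantize_u8(*t) for t in qg], dtype=np.float32)
# One value per channel -> gamma is restored exactly; eps stays fp32.
assert np.array_equal(restored_gamma, gamma)
```

As with PRelu's alpha, the per-channel range collapses to the single stored value, so the uint8 round trip is lossless per channel.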