neurosim / DNN_NeuroSim_V1.4

Benchmark framework of compute-in-memory based accelerators for deep neural network (inference engine focused)

Intermediate Activation Quantization #2

Closed aagoksoy closed 1 year ago

aagoksoy commented 1 year ago

I'm trying to understand how quantization works with different ADC precisions and input vector precisions. After the vector-matrix multiplication, the ADC output determines the precision of the partial sums for that layer's operation. For the next layer's operations, we use the sums coming from that tile. So, is the ADC precision of the first tile equal to the input vector precision of the second layer's tile? If so, the activation precision (input vector precision) becomes unimportant. My guess is that the adder and register pair changes the bit precision to the input vector precision rather than the ADC precision. Am I right, or am I missing something?
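To make the question concrete, here is the toy bit-width bookkeeping I have in mind (all numbers are made up):

```python
# Made-up example parameters, just to illustrate my mental model:
adc_bits   = 5   # ADC precision -> precision of one digitized partial sum
input_bits = 8   # input vector (activation) precision, applied bit-serially

# Shift-and-add over the input_bits bit-serial cycles grows the width to
# roughly adc_bits + input_bits (an upper bound on the accumulated sum):
shift_add_bits = adc_bits + input_bits
print(shift_add_bits)  # 13: wider than either the ADC or the input precision
```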

neurosim commented 1 year ago

Yes, you're right: the ADC precision specifies the precision of the output of the vector-matrix multiplication. The register + shift-adder and the other accumulations (for CNN layers) all increase the precision before the activation is sent to the next layer. Because of all these accumulations, the precision of the activation can become larger than the desired input precision, so it must be quantized again. This is similar to how floating-point multiplication is done in a conventional processor, where the mantissa calculation is performed at a higher precision and then quantized at the end.
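As a rough illustration of the precision growth and the final requantization (a minimal numerical sketch, not the actual NeuroSim code; the bit widths, array size, and 1-bit cells are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy parameters (not NeuroSim's defaults):
IN_BITS  = 8    # input activation precision, applied bit-serially
ADC_BITS = 5    # ADC precision: clips each column's digitized partial sum
ROWS     = 32   # rows read in parallel in one tile

x = rng.integers(0, 2**IN_BITS, ROWS)   # quantized input activations
w = rng.integers(0, 2, ROWS)            # 1-bit cells for simplicity

acc = 0
for b in range(IN_BITS):                       # one cycle per input bit
    bit_slice = (x >> b) & 1                   # current input bit of every row
    col_sum   = int(np.dot(bit_slice, w))      # analog column sum, digitized
    adc_out   = min(col_sum, 2**ADC_BITS - 1)  # ADC clips to ADC_BITS
    acc      += adc_out << b                   # register + shift-adder

print(acc.bit_length())  # > ADC_BITS and > IN_BITS: the precision has grown
# Crude truncation back to IN_BITS for the next layer's input
# (a real design would scale and round rather than just drop low bits):
act_next = acc >> max(acc.bit_length() - IN_BITS, 0)
```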

parkface commented 1 year ago

Then, does the NeuroSim architecture include digital modules that support this quantization (i.e., reducing the increased precision back to the input precision) after all of the partial-sum accumulations?

neurosim commented 1 year ago

Currently, the digital modules that would perform this quantization are not included in NeuroSim.
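For reference, a hypothetical software-side helper performing this step might look like the following (a sketch only; the function name, rounding scheme, and clipping range are assumptions, and nothing like this ships with NeuroSim):

```python
import numpy as np

def requantize(psum, acc_bits, target_bits):
    """Hypothetical helper (not part of NeuroSim): reduce an accumulated
    partial sum from acc_bits down to target_bits by rounding off the low
    bits and saturating to the target range."""
    shift = max(acc_bits - target_bits, 0)
    rounded = (psum + (1 << (shift - 1))) >> shift if shift else psum
    return np.clip(rounded, 0, 2**target_bits - 1)

# e.g. 13-bit accumulated activations back to 8-bit input precision:
print(requantize(np.array([5000, 8100]), acc_bits=13, target_bits=8))
```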