Closed aagoksoy closed 1 year ago
Yes, you're right: the ADC precision specifies the precision of the output of the vector-matrix multiplication. The register + shift-adder and the other accumulations (for CNN layers) all increase the precision before the activation is sent to the next layer. Because of these accumulations, the precision of the activation can become larger than the desired input precision, in which case it must be quantized again. This is similar to how a floating-point multiplication is done in a normal processor, where the mantissa calculation is done at a higher precision and then quantized at the end.
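To make the bit growth concrete, here is a minimal Python sketch. It is not NeuroSim code: the function names (`shift_add`, `max_accumulated_bits`, `requantize`) and the model behind them are assumptions for illustration only. It assumes unsigned ADC codes, bit-serial inputs applied LSB first, and truncation as the quantization step, and it ignores the extra shift-add that multi-cell weights would need.

```python
def shift_add(adc_codes):
    """Combine per-cycle ADC codes (LSB input bit first), as a shift-adder would."""
    return sum(code << i for i, code in enumerate(adc_codes))


def max_accumulated_bits(adc_bits, input_bits, num_partial_sums):
    """Worst-case bit width after shift-add plus accumulation of partial sums."""
    max_code = (1 << adc_bits) - 1                     # largest ADC output code
    max_shift_add = max_code * ((1 << input_bits) - 1)  # all cycles at full scale
    return (max_shift_add * num_partial_sums).bit_length()


def requantize(value, value_bits, target_bits):
    """Keep the top target_bits of an unsigned value_bits-wide value (truncation)."""
    shift = max(value_bits - target_bits, 0)
    return min(value >> shift, (1 << target_bits) - 1)  # saturate for safety


# Hypothetical configuration: 5-bit ADC, 8-bit inputs, 4 partial sums accumulated.
adc_bits, input_bits, num_partial_sums = 5, 8, 4
wide_bits = max_accumulated_bits(adc_bits, input_bits, num_partial_sums)
worst_case = shift_add([(1 << adc_bits) - 1] * input_bits) * num_partial_sums
next_layer_input = requantize(worst_case, wide_bits, input_bits)
print(wide_bits, worst_case, next_layer_input)
```

Under these assumptions the accumulated value is wider than both the ADC precision and the input precision, so a separate step has to bring it back to the input width before the next layer can use it.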
Then, does the NeuroSim architecture include digital modules that perform this quantization (reducing the increased precision back to the input precision) after all the partial sums have been accumulated?
Currently, the digital modules that would perform this quantization are not included in NeuroSim.
I'm trying to understand how quantization works with different ADC precisions and input vector precisions. After the vector-matrix multiplication, the ADC output determines the precision of the partial sums for that layer's operation. The next layer's operations use the sums coming from that tile. So, is the ADC precision of the first tile equal to the input vector precision of the second layer's tile? If so, the activation precision (input vector precision) becomes unimportant. My guess is that the adder and register pair convert the bit precision to the input vector precision rather than leaving it at the ADC precision. Am I right, or am I missing something?