neurosim / DNN_NeuroSim_V1.3

Benchmark framework of compute-in-memory based accelerators for deep neural network (inference engine focused)

Questions about AdderTree::CalculateLatency #37

Closed onefanwu closed 1 year ago

onefanwu commented 1 year ago

Hello @neurosim, I want to ask about the meaning of the first parameter (numRead) of AdderTree::CalculateLatency, declared as void AdderTree::CalculateLatency(double numRead, int numUnitAdd, double _capLoad), at line 120 of AdderTree.cpp.

Question 1

At line 694 of Chip.cpp, Gaccumulation->CalculateLatency(ceil(numTileEachLayer[1][l]*netStructure[l][5]*(numInVector/(double) Gaccumulation->numAdderTree)), numTileEachLayer[0][l], 0); I want to know why netStructure[l][5] needs to be multiplied by numTileEachLayer[1][l].

Question 2

At line 489 of Tile.cpp, accumulationCM->CalculateLatency((int)(numInVector/param->numBitInput)*ceil(param->numColMuxed/param->numColPerSynapse), numPE, 0); I want to know what ceil(param->numColMuxed/param->numColPerSynapse) means. Moreover, when param->numColPerSynapse is greater than param->numColMuxed, the value of ceil(param->numColMuxed/param->numColPerSynapse) is 0, because both operands are of type int and the division truncates before ceil is applied. Is this a bug?

neurosim commented 1 year ago

Hi! Thanks for your interest in our work and for your careful review of the code! These are good questions. numRead means how many times the adder trees are used.

For Q1, I think there is a typo. The numTileEachLayer[1][l] should be numColPerSynapse. In other words, numTileEachLayer[1][l]*netStructure[l][5] should be numColPerSynapse*netStructure[l][5], which equals weightMatrixCol. So the first parameter is the total number of times all the adder trees are used, divided by the number of adder trees.

For Q2, because the number of adder trees is defined as numPE*param->numColSubArray/param->numColMuxed, the first parameter is in fact numInVector/param->numBitInput*numPE*param->numColSubArray/param->numColPerSynapse/(numPE*param->numColSubArray/param->numColMuxed), if memory utilization is not considered. Usually we won't set param->numColPerSynapse greater than param->numColMuxed, but thanks for pointing out the potential bug; we will fix it. And by the way, here we use the real number of columns of the weight matrix (weightMatrixCol) for the chip-level adder tree, while not considering the memory utilization at the tile level (numPE*param->numColSubArray). In theory, we should make them consistent, but it won't affect the results much anyway.

onefanwu commented 1 year ago

Thank you very much for your detailed and critical reply, which is very helpful.