Open alfa871212 opened 10 months ago
Hello,
The --inference flag enables or disables the hardware-aware PyTorch simulation of inference accuracy. If you want to know the inference accuracy given certain properties of your memory device, array, and peripherals (e.g., on/off ratio, ADC precision, array size), you should set --inference 1.
If you are not concerned with the inference accuracy and only want to see the power, performance, and area (PPA) of the CIM accelerator, use --inference 0. With this setting, the reported inference accuracy does not account for the device properties you pass on the command line; those properties are only used by the C++ code that estimates the PPA.
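As a rough illustration (not the actual NeuroSim code; the function names and the simple non-ideality model below are assumptions), the flag can be thought of as selecting between a quantization-only forward pass and one that also injects device/peripheral effects such as a finite on/off ratio and ADC rounding:

```python
# Illustrative sketch only -- not DNN+NeuroSim source code.
import torch

def quantize(x, bits):
    # Uniform symmetric quantization to the given bit width.
    scale = x.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(x / scale) * scale

def cim_linear(x, w, inference, onoffratio=10.0, adc_bits=4, w_bits=8, in_bits=8):
    x_q, w_q = quantize(x, in_bits), quantize(w, w_bits)
    if inference == 0:
        # --inference 0: accuracy reflects only input/weight quantization.
        return x_q @ w_q.t()
    # --inference 1 (illustrative): a finite on/off ratio adds a conductance floor,
    # and the analog partial sums are rounded by a low-precision ADC.
    g_min = w_q.abs().max() / onoffratio
    w_hw = torch.sign(w_q) * (w_q.abs() + g_min)
    return quantize(x_q @ w_hw.t(), adc_bits)

x, w = torch.randn(4, 16), torch.randn(8, 16)
print(cim_linear(x, w, inference=0)[0, :3], cim_linear(x, w, inference=1)[0, :3])
```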
Hope this helps! Please check the manual for a more in-depth description of NeuroSim's functionality. James
So, if the inference flag is set to 0, the reported accuracy is not trustworthy? (A test log is still generated under the log dir.)
If you want to simulate the inference accuracy of the model using CIM, use --inference 1. If you don't care about the effects of CIM on the accuracy and only want to see the PPA, use --inference 0.
NeuroSim will output the accuracy and PPA in both cases; however, with --inference 0 the simulation only considers the input and weight quantization of the model, not any effects from the in-memory computing circuits.
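In other words, with --inference 0 the only accuracy-relevant step is rounding the inputs and weights onto a fixed grid, roughly like this WAGE-style sketch (an illustrative assumption, not the repo's quantizer):

```python
# Illustrative WAGE-style quantizer -- an assumption, not NeuroSim's implementation.
import torch

def wage_quantize(x, bits=8):
    # Round onto a uniform grid with step 2^(1-bits) and clip to (-1, 1).
    step = 2.0 ** (1 - bits)
    return torch.clamp(torch.round(x / step) * step, -1 + step, 1 - step)

w = torch.empty(3, 3).uniform_(-1, 1)
print(wage_quantize(w, bits=2))  # a coarse 2-bit grid makes the rounding obvious
```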
What is the purpose of
--inference 0
in inference.py? If we keep the other flags the same and use the same Param.cpp, it takes around 50 minutes to run with --inference 1, but only around 50 seconds with --inference 0. The PC is an i9-12900K with a 3080 Ti. The inference accuracy and other metrics generated by NeuroSim also seem similar. Thank you very much. The commands are listed below:
python inference.py --dataset cifar10 --model VGG8 --mode WAGE --inference 1 --cellBit 1 --subArray 128 --parallelRead 128 --ADCprecision 4 --onoffratio 10 --logdir N7SRAM/VSA_VGG/ > N7SRAM/VSA_VGG/N7_SRAM_inf_1_onoffratio10.txt
The output from NeuroSim:
Energy Efficiency TOPS/W (Layer-by-Layer Process): 74.7287
Throughput TOPS (Layer-by-Layer Process): 3.86618
Throughput FPS (Layer-by-Layer Process): 3138.56
Compute efficiency TOPS/mm^2 (Layer-by-Layer Process): 0.18728
Comparison
python inference.py --dataset cifar10 --model VGG8 --mode WAGE --inference 0 --cellBit 1 --subArray 128 --parallelRead 128 --ADCprecision 4 --onoffratio 10 --logdir N7SRAM/VSA_VGG/ > N7SRAM/VSA_VGG/N7_SRAM_inf_1_onoffratio10.txt
The output from NeuroSim:
Energy Efficiency TOPS/W (Layer-by-Layer Process): 69.8384
Throughput TOPS (Layer-by-Layer Process): 3.86618
Throughput FPS (Layer-by-Layer Process): 3138.56
Compute efficiency TOPS/mm^2 (Layer-by-Layer Process): 0.18728
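For a quick side-by-side of the two runs, something like the following sketch can diff the metric lines in the two redirected logs (the helper and the second file path are assumptions for illustration, not part of NeuroSim; adjust the paths to whatever you redirected to):

```python
# Hypothetical helper for comparing two redirected NeuroSim logs.
import re

def read_metrics(path):
    # Collect lines of the form "Some metric name: 123.45" into a dict.
    metrics = {}
    with open(path) as f:
        for line in f:
            m = re.match(r"\s*(.+?):\s*([-+0-9.eE]+)\s*$", line)
            if m:
                metrics[m.group(1)] = float(m.group(2))
    return metrics

inf1 = read_metrics("N7SRAM/VSA_VGG/N7_SRAM_inf_1_onoffratio10.txt")
inf0 = read_metrics("N7SRAM/VSA_VGG/N7_SRAM_inf_0_onoffratio10.txt")  # illustrative path
for key in sorted(inf1.keys() & inf0.keys()):
    if inf1[key] != inf0[key]:
        print(f"{key}: inference=1 -> {inf1[key]}, inference=0 -> {inf0[key]}")
```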