yaoyuannnn / experiments

Experiments for the SMAUG project

[Question] Multiple Aladdin results in the summary #1

Closed samialabed closed 2 years ago

samialabed commented 2 years ago

Hi!

I am running the design generated by sweeps/main.py with the Minerva model, with only one configuration passed in inputs.json.

I am slightly confused by the results in nnet_fwd_summary and even stats.txt. The summary file contains multiple Aladdin result blocks with varying values. I assume this is because of sampling, with each block being a separate sample? However, even if I manually delete the sampling_level line in the generated run.sh, it still produces these blocks. Should I take the 90th percentile of the results if I want an estimate of the power usage?

In a previous project I did with gem5-Aladdin, the simulator reported only one block in the summary and two blocks in stats.txt (warmup, then the main one), so it was easy to calculate the EDP, average power consumption, and execution time. I'm not sure whether SMAUG behaves differently or whether I am looking at the right results file.

Can you please clarify or point me towards how to read the results?

Thanks!

xyzsam commented 2 years ago

Sorry for the delay.

I'm guessing what's happening is that the Minerva model contains five FC layers, and each layer is a separate run. Each section in the nnet_fwd_summary corresponds to one layer, and stats.txt should contain those five layers + warmup + shutdown.
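One way to sanity-check the layer count is to split stats.txt into its dump sections and count them. Below is a minimal sketch, assuming each section is wrapped in the usual gem5 "Begin Simulation Statistics" / "End Simulation Statistics" delimiter lines. `split_stats_dumps` is a hypothetical helper, not part of SMAUG or gem5-Aladdin, and the sample text is synthetic.

```python
def split_stats_dumps(text):
    """Split the contents of a gem5 stats.txt into one list of lines per dump.

    Assumes each dump is delimited by lines containing
    "Begin Simulation Statistics" and "End Simulation Statistics".
    """
    dumps = []
    current = None  # None means we are outside any dump section
    for line in text.splitlines():
        if "Begin Simulation Statistics" in line:
            current = []
        elif "End Simulation Statistics" in line:
            if current is not None:
                dumps.append(current)
            current = None
        elif current is not None and line.strip():
            current.append(line)
    return dumps


# Synthetic example with two dump sections (stat names are placeholders).
sample = """
---------- Begin Simulation Statistics ----------
stat_a 100 # placeholder statistic
---------- End Simulation Statistics   ----------

---------- Begin Simulation Statistics ----------
stat_a 250 # placeholder statistic
stat_b 7   # placeholder statistic
---------- End Simulation Statistics   ----------
"""

dumps = split_stats_dumps(sample)
print(len(dumps), "dump sections found")
```

To use it on a real run, pass in the text read from stats.txt. If the guess above is right, the Minerva run would show seven sections: warmup, five layers, and shutdown.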

samialabed commented 2 years ago

Hi Sam, thanks for the reply. This makes me wonder how I would best measure the accelerator's energy and the system's latency correctly if every layer is a separate (non-deterministic) run. Do you have any suggestions or intuition on how to best understand the system behavior in this case?

Thanks!

xyzsam commented 2 years ago

That would depend on what you're trying to measure - total energy or average power, and is that for the whole system or just the accelerators? If you want total energy of the system, you'd need to add on a CPU + DRAM power model (at minimum), which gem5-aladdin doesn't provide (you would need to postprocess the results through McPAT or something). If you just care about the accelerators, energy is additive, and you can add up total cycles spent in the accelerators, so you can compute average power.

Or maybe you want to measure EDP, in which case you'd multiply energy by time instead of dividing.
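The accelerator-only arithmetic can be sketched as follows. This is an illustration under stated assumptions, not code from SMAUG: `accelerator_metrics` is a hypothetical helper, the per-layer energy and cycle values are made-up numbers that you would read off each block of the summary file yourself, and the units (pJ, ns) and the clock period are assumptions to adjust to your setup.

```python
def accelerator_metrics(layer_energy_pj, layer_cycles, cycle_time_ns):
    """Combine per-layer accelerator results into totals.

    layer_energy_pj: energy of each layer's run, in pJ (assumed unit)
    layer_cycles:    accelerator cycles of each layer's run
    cycle_time_ns:   accelerator clock period, in ns (assumed unit)
    """
    total_energy_pj = sum(layer_energy_pj)       # energy is additive
    total_cycles = sum(layer_cycles)             # layers run one after another
    total_time_ns = total_cycles * cycle_time_ns
    avg_power_mw = total_energy_pj / total_time_ns  # pJ / ns == mW
    edp_pj_ns = total_energy_pj * total_time_ns     # energy-delay product
    return {
        "total_energy_pj": total_energy_pj,
        "total_time_ns": total_time_ns,
        "avg_power_mw": avg_power_mw,
        "edp_pj_ns": edp_pj_ns,
    }


# Made-up values for three layers, with an assumed 1 ns clock period.
metrics = accelerator_metrics(
    layer_energy_pj=[10.0, 20.0, 30.0],
    layer_cycles=[100, 200, 300],
    cycle_time_ns=1.0,
)
print(metrics)
```

Note that the time here covers only cycles spent in the accelerators. End-to-end system latency would also include CPU and memory time between layers, which this sketch does not model.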

samialabed commented 2 years ago

Thanks! Yeah, my main use case is optimizing the accelerator design without changing the underlying system, so I don't have to use McPAT.

To confirm my understanding: the accelerator's EDP would be (sum of each layer's energy, where each layer is a separate block in the summary file) * (total execution time), while the average power is (sum of each layer's energy) / (sum of each layer's cycles, converted to time).

This resolved my confusion, thanks a lot for your help!