efzulian closed this 6 months ago
I've had some time to think about this issue and the bigger picture of DRAMPower as a whole, so this is going to be a long post that we can hopefully partially reuse in the README in the future.
It is about time we properly spec what DRAMPower actually calculates, and trim where possible. This touches issues #26, #40 and #41.
DRAMPower has 30+ energy-related outputs (listed in MemoryPowerModel::power_calc), 16 cycle counters (listed in CommandAnalysis::clearStats), and a handful of timestamps that are tracked along the way. This seems excessive.
The last 9 of the energy outputs relate to IO / termination. Let's ignore those for now to limit the scope of what we are doing here.
Commands consume a certain amount of active energy to execute. Executing a command can change the power-state of the memory. In each power state, a specific background power is consumed.
Goals:
Now, calculating the total energy usage is conceptually simple:
_Action points_ I would like to see the structure of this procedure reflected in the code, i.e.:
In CommandAnalysis.cc and friends:
In MemoryPowerModel.cc and friends:
There are 5 (non-overlapping) power states to distinguish. In each of these states, a specific background current is consumed:
If we see fast and slow-exit as two distinct cases, then we should need only 6 cycle counters. After we implement #26, we can go back to 5. I've referred to this image before:
(I'm looking at MemoryPowerModel.cc, line 150+ here):
These 5 energy values we calculate seem to (approximately) correspond to the active power of individual commands: (act_energy, pre_energy, read_energy, write_energy, ref_energy). Working backwards from the currents, the mapping to background currents per power state is probably:
precycles --> energy.pre_stdby_energy
actcycles --> energy.act_stdby_energy
f_pre_pdcycles + s_pre_pdcycles --> energy.*_pre_pd_energy
sref_cycles_idd6 --> ?? I am not completely sure what is happening in engy_sref():
((idd6 * sref_cycles_idd6) + ((idd5 - idd3n) * (sref_ref_act_cycles + spup_ref_act_cycles + sref_ref_pre_cycles + spup_ref_pre_cycles))) * vdd * clk;
(idd6 * sref_cycles_idd6) is a background energy component, while the second term is an active component. So I think that self-refreshes are not counted in ref_energy.
I see two options:
My personal preference is option 2. I think we currently have a complex code base that calculates more outputs than most of our users are interested in, and interpreting what they mean is much harder than it needs to be. Maintenance is not fun. We can do better, so I think an overhaul is required. Even though we have released a few versions with useful upgrades, I think a new release with a real overhaul of the core code would make this project much more viable.
I'll stop here for now, since I think it is good to give an opportunity for feedback first. If this is the direction we are moving into, then I would be happy to set up the class-infrastructure I have in mind.
I believe that engy_sref() is intended to return the self-refresh active energy without the background energy.
Some text extracted from Karthik Chandrasekar's PhD thesis, 2.5.4 Self-Refresh Mode Transition:
The IDD6 current is consumed for the time period spent in the self-refresh mode as defined in the trace (nSR), which excludes the time spent in finishing the explicit auto-refresh (as depicted in Figure 2.13). The auto-refresh consumes IDD5 − IDD3N over one refresh period (nRFC) from the start of the self-refresh. IDD2N current is consumed when exiting the self-refresh state for the nXSDLL exit period.
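Read literally, the quoted paragraph names three contributions, which could be written as the sum below. This is only a sketch of that reading: the symbols N_IDD6 (the number of cycles for which IDD6 is consumed), t_CK (the clock period) and V_DD are notation introduced here, and the sum covers only the terms the quote mentions, for the case where the exit arrives after the explicit auto-refresh has finished.

```latex
E_{\mathrm{SR}} \approx V_{DD} \cdot t_{CK} \cdot \Big(
    I_{DD6} \cdot N_{IDD6}
  + (I_{DD5} - I_{DD3N}) \cdot n_{RFC}
  + I_{DD2N} \cdot n_{XSDLL}
\Big)
```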
Briefly, the time spent in self-refresh is broken into three parts:
Furthermore, Karthik's thesis explores different situations regarding the arrival time of the self-refresh exit command (all scenarios are described in issue #39).
I think you are right: self-refreshes are not counted in ref_energy. There is sref_energy for that.
About the changes, my vote goes to option 2. Of course I volunteer to help.
In my opinion, the first thing to do would be a clean-up that removes everything unnecessary, reducing the chance that people extend it. I see that you are already making lots of improvements in the refact_ca branch.
Note to self: ignore the "SRE" edge, and the little state-bubble it flows into in the state diagram (it's a simplification at best, and the figure can be improved). The general rule that DRAMPower follows when it comes to modeling SREF is:
On SREN:
On SRX:
We always end up in an IDD2N state.
The idea here is to review the counters of clock cycles in self-refresh power-up mode for all possible self-refresh exit contexts.
Both scenarios with and without DLL should be considered.
Other counters like "latest_pre_cycle" (which are assigned in the same context) should also be considered in the analysis.
Some code:
Additionally, there are useful comments in the pull-request #39.
It is also important to evaluate the relevance of these counters (spup_cycles and latest_pre_cycle). It seems that they are used to calculate some energy components which are not part of the total trace / pattern energy (total_energy). E.g.: latest_pre_cycle --> idle_pre_update() --> idlecycles_pre --> energy.idle_energy_pre --> cout
Removing irrelevant code chunks / variables (if any) would improve the maintainability of the code.