This PR optimizes how certain equations inside DmpCeff are evaluated:
The separate functions DmpPi::vl0 and DmpPi::dvl0dt, invoked through DmpAlg::vo and DmpAlg::dVoDt respectively, are always called together in evalVlEqns. The same conditions in DmpAlg::vo and DmpAlg::dVoDt are therefore checked twice. I refactored the code into a single function that evaluates the function and its derivative at the same time.
I analyzed statistics on how often conditions in DmpAlg::vo and DmpAlg::dVoDt are fulfilled and reordered them so that the most frequently fulfilled one is checked first.
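The two changes above can be sketched as follows. This is a minimal illustration and not the actual DmpCeff code: the `ValueAndSlope` struct, the `rampResponse` function, and the single-pole ramp-response equations are hypothetical stand-ins, and which branch is "most frequent" is an assumption made for the example.

```cpp
#include <cmath>

// Hypothetical result type: the value and its time derivative, returned together.
struct ValueAndSlope {
  double v;
  double dv_dt;
};

// Illustrative ramp response of a single-pole RC stage (NOT the DmpCeff equations).
// t_r is the ramp duration and tau the time constant; both are assumed to be > 0.
// Each branch condition is tested once and serves both the value and the derivative.
ValueAndSlope rampResponse(double t, double t_r, double tau) {
  // Assumed most frequent case, so it is checked first: t lies inside the ramp.
  if (t > 0.0 && t <= t_r) {
    const double e = std::exp(-t / tau);  // shared by value and derivative
    return {(t - tau * (1.0 - e)) / t_r, (1.0 - e) / t_r};
  }
  // Next case: the ramp has finished.
  if (t > t_r) {
    const double d = std::exp(-(t - t_r) / tau) - std::exp(-t / tau);
    return {1.0 - tau * d / t_r, d / t_r};
  }
  // Remaining case: before the ramp starts.
  return {0.0, 0.0};
}
```

A caller that previously made two calls, each re-checking the same conditions, would make one call such as `rampResponse(t, t_r, tau)` and read both `v` and `dv_dt` from the result. The fused form also lets the exponential terms be computed once and shared.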
To check whether the optimizations work, I tested the modification with OpenROAD (using OpenROAD-flow-scripts) on the BlackParrot design with the Nangate45 PDK. Without the modification, the global placement stage ("3_3_place_gp") took approximately 554.5 seconds; with the modification it took approximately 547.5 seconds, about 7 seconds (roughly 1.3%) faster. The test was performed on an Intel i7-8700 CPU @ 3.20GHz.
Issues or PRs should be filed with https://github.com/parallaxsw/OpenSTA if still relevant. This is effectively a fork (though not strictly for historical reasons).