Breakthrough-Energy / PreREISE

Generate input data for scenario framework
https://breakthrough-energy.github.io/docs/
MIT License

refactor: calculate constraints all at once #300

Closed dmuldrew closed 1 year ago

dmuldrew commented 1 year ago

Purpose

This PR refactors the constraints function so that its share of overall runtime drops from 48% to about 10%, which appears to improve overall runtime by 30-40%. It also moves the call to linprog into the main loop, which reduces the optimization code from about 20% of overall runtime to 8%. Together, these two changes improve the runtime of the algorithm by roughly 50%.

I also added two small test datasets to the PR. One contains data for 5 cars: https://github.com/Breakthrough-Energy/PreREISE/blob/dmuldrew/refactor_calculate_constraints/prereise/gather/demanddata/transportation_electrification/tests/test_census_data.csv and the other contains data for 30 cars: https://github.com/Breakthrough-Energy/PreREISE/blob/dmuldrew/refactor_calculate_constraints/prereise/gather/demanddata/transportation_electrification/tests/profiling_census_data.csv

What the code is doing

Calculates constraints over the entire dataframe at once instead of calculating them for individual trip data inside a double loop. Cost is still calculated inside the loops because it depends on previous iterations.
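
To illustrate the general idea, here is a minimal sketch that contrasts a per-row loop with a whole-dataframe calculation. The column names (`trip_miles`, `dwell_hours`), the `power` parameter, and both function names are assumptions made for the example; they are not taken from smart_charging.py, and the real constraints are more involved.

```python
import pandas as pd


def constraints_loop(trips: pd.DataFrame, kwhmi: float, power: float) -> pd.DataFrame:
    """Compute the (hypothetical) constraints one trip at a time."""
    rows = []
    for _, trip in trips.iterrows():
        energy_used = trip["trip_miles"] * kwhmi   # kWh consumed by the trip
        max_charge = trip["dwell_hours"] * power   # kWh that fit in the dwell period
        rows.append(
            {
                "energy_used": energy_used,
                "max_charge": max_charge,
                "feasible": max_charge >= energy_used,
            }
        )
    return pd.DataFrame(rows, index=trips.index)


def constraints_vectorized(trips: pd.DataFrame, kwhmi: float, power: float) -> pd.DataFrame:
    """Compute the same quantities for every trip in one pass over the columns."""
    out = pd.DataFrame(index=trips.index)
    out["energy_used"] = trips["trip_miles"] * kwhmi
    out["max_charge"] = trips["dwell_hours"] * power
    out["feasible"] = out["max_charge"] >= out["energy_used"]
    return out


if __name__ == "__main__":
    # Made-up trips; 6.6 kW is an assumed charger power, not a project value.
    trips = pd.DataFrame(
        {"trip_miles": [10.0, 40.0, 5.0], "dwell_hours": [1.0, 0.5, 2.0]}
    )
    print(constraints_vectorized(trips, kwhmi=0.242, power=6.6))
```

Both functions return the same table; the vectorized version simply avoids the Python-level loop, which is where this kind of refactor usually recovers time. Anything that depends on the previous iteration, such as the cost here, still has to stay in the loop.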

Testing

Manual performance testing and automated integration testing.

Where to look

The performance-related changes are in smart_charging.py.

Time estimate

~15min

rouille commented 1 year ago

This is outside the scope of this PR, but why are segsum and segcum parameters of the calculate_optimization function when they could be derived from seg within the function?

See line 365 of the smart_charging module

segsum = sum(seg)
segcum = np.cumsum(seg)
linprog_result = calculate_optimization(
    charging_consumption,
    rates,
    elimit,
    seg,
    segsum,
    segcum,
    total_trips,
    kwh,
)
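
A minimal sketch of that suggestion, assuming seg is a one-dimensional sequence of numbers. The helper name `derive_segment_totals` is hypothetical; the body of calculate_optimization is not shown in this thread, so this only illustrates computing the two derived values in one place.

```python
import numpy as np


def derive_segment_totals(seg):
    """Return (segsum, segcum) computed from seg alone.

    Hypothetical helper: calculate_optimization could call this internally
    instead of receiving segsum and segcum as separate arguments.
    """
    seg = np.asarray(seg)
    segsum = seg.sum()        # same value as sum(seg)
    segcum = np.cumsum(seg)   # running total, as in the call site above
    return segsum, segcum
```

With this in place the call site would pass seg only, which removes two arguments that can otherwise drift out of sync with it.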
rouille commented 1 year ago

Another comment outside the scope of this PR: in the smart_charging function, the kwhmi parameter is said to vary with model_year and veh_type, which are also parameters of the function. Could we have a dictionary that maps (model_year, veh_type) to kwhmi? That would simplify calling the function and make it less obscure to the user. In your test, kwhmi=0.242; who would come up with that value?
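
A rough sketch of what such a lookup could look like. The table name, the function name, and the key (2040, 1) are placeholders: only the value 0.242 appears in this thread, and which model year and vehicle type it belongs to is not stated here, so the pairing below is an assumption for illustration.

```python
# Hypothetical lookup table: (model_year, veh_type) -> kwhmi.
# The key below is a placeholder; real entries would have to come from the
# data provider.
KWHMI_BY_VEHICLE = {
    (2040, 1): 0.242,
}


def lookup_kwhmi(model_year, veh_type, table=KWHMI_BY_VEHICLE):
    """Return kwhmi for a vehicle, failing loudly when no entry exists."""
    try:
        return table[(model_year, veh_type)]
    except KeyError:
        raise KeyError(
            f"no kwhmi entry for model_year={model_year}, veh_type={veh_type}"
        ) from None
```

The smart_charging function could then take model_year and veh_type only and resolve kwhmi itself, so callers never need to know the number.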

dmuldrew commented 1 year ago

@rouille Yeah, I agree a dictionary is a good idea…we’d probably need to request that from them so we can have more comprehensive tests of the various code pathways.

dmuldrew commented 1 year ago

The latest changes reduce the optimization from ~20% to ~8% of total runtime:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   370     14166    3624525.0    255.9      1.8                      linprog_inputs = calculate_optimization(
   371      7083      14128.0      2.0      0.0                          charging_consumption,
   372      7083      14369.0      2.0      0.0                          rates,
   373      7083      14190.0      2.0      0.0                          elimit,
   374      7083      14488.0      2.0      0.0                          seg,
   375      7083      14036.0      2.0      0.0                          segsum,
   376      7083      14784.0      2.1      0.0                          segcum,
   377      7083      13873.0      2.0      0.0                          total_trips,
   378      7083      14797.0      2.1      0.0                          kwh,
   379                                                               )
   380                                           
   381      7083   12041961.0   1700.1      6.0                      linprog_result = linprog(**linprog_inputs)

So now this single line accounts for almost half the runtime (compared to 18% before):

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   493                                                                   # copy individual back to newdata if it can be an EV
   494      6804   94103836.0  13830.7     47.2                          newdata.iloc[i : i + total_trips] = individual
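
One common way to cut the cost of repeated `.iloc` block assignments is to write into a plain NumPy array and rebuild the DataFrame once at the end. The sketch below assumes every column has the same numeric dtype and that each block is already a NumPy array; the function names are hypothetical, and this is not necessarily the approach taken in smart_charging.py.

```python
import numpy as np
import pandas as pd


def write_back_with_iloc(newdata: pd.DataFrame, blocks) -> pd.DataFrame:
    """Baseline: assign each (start, block) pair through DataFrame.iloc."""
    out = newdata.copy()
    for start, block in blocks:
        out.iloc[start : start + len(block)] = block
    return out


def write_back_with_numpy(newdata: pd.DataFrame, blocks) -> pd.DataFrame:
    """Alternative: assign into a NumPy copy, then build the DataFrame once."""
    values = newdata.to_numpy(copy=True)  # assumes a single numeric dtype
    for start, block in blocks:
        values[start : start + len(block)] = block
    return pd.DataFrame(values, index=newdata.index, columns=newdata.columns)
```

Both functions leave `newdata` untouched and return the same result. The second skips pandas' indexing machinery on every iteration, which is typically where the per-hit time of a line like the one above goes. With mixed column dtypes, `to_numpy` would fall back to an object array, so this sketch would need per-column arrays instead.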