timroepcke opened 4 years ago
I timed the different sections of run_smooth. I used the standard example model with the electrolyzer power set to 1e6, so the model could always be solved. I increased n_intervals from 1 to 1000 in powers of 10. All timing measurements were done on my laptop, using the standard oemof v0.3.3. Here is what I found:
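For reference, per-section timings like these can be collected with a small helper along the following lines. This is a minimal sketch, not the actual instrumentation I used; the section names and the stand-in workloads are illustrative only.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock time per named section, across all intervals.
section_times = defaultdict(float)

@contextmanager
def timed(name):
    """Add the elapsed wall-clock time of the enclosed block to `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        section_times[name] += time.perf_counter() - start

# Usage inside the interval loop (section names are illustrative):
for _ in range(10):
    with timed("prepare"):
        sum(range(1_000))    # stand-in for component preparation
    with timed("solve"):
        sum(range(10_000))   # stand-in for the solver call

# Sections sorted by total time, most expensive first.
print(sorted(section_times, key=section_times.get, reverse=True))
```

Summing over all intervals (rather than printing per interval) makes it easy to see which section dominates as n_intervals grows.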
Everything under INITIALIZATION (converting from list to dict, creating component objects, etc.): this does not scale with n_intervals, as it is only executed once.
This covers the first steps of each interval, from process printing (disabled for testing) to component preparation. The time increases linearly but is negligible overall.
The first part of the section RUN THE SIMULATION: creating model_to_solve, updating bus constraints and writing the model to disk. It seems to grow faster than linearly, but what actually happens is that the LP file is written to disk during the first interval, which throws off the curve. For reasonable n_intervals values between 10 and 100 it is negligible. Therefore, creating a model once and reusing it MIGHT reduce the runtime slightly, but is probably not worth the effort.
Solving the model using the cbc solver. This also seems to grow faster than linearly with n_intervals. I am not sure why, since the intervals should be independent; my guess is that there is some initialization overhead, which inflates the apparent per-interval time for low numbers of intervals. With larger numbers it grows linearly. There is not much we can do here, except use another solver. This makes up between 10 and 20% of the total runtime.
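The "faster than linear" impression is consistent with a constant start-up cost on top of a linear per-interval cost, i.e. T(n) ≈ a + b·n. A quick way to check that hypothesis is to fit the two parameters through two measurements; the timings below are made-up placeholders, not my actual numbers.

```python
def fit_linear_with_offset(n1, t1, n2, t2):
    """Fit T(n) = a + b*n through two (n_intervals, runtime) points,
    separating a fixed start-up cost (a) from the per-interval cost (b)."""
    b = (t2 - t1) / (n2 - n1)   # cost per interval
    a = t1 - b * n1             # fixed start-up overhead
    return a, b

# Hypothetical timings: 0.5 s at 10 intervals, 2.3 s at 100 intervals.
a, b = fit_linear_with_offset(10, 0.5, 100, 2.3)
print(a, b)  # prints 0.3 0.02
```

If a comes out clearly positive while a fit over larger n stays on the same line, the solver cost is linear after all and the curvature at small n is just the constant offset.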
The first step during HANDLE RESULTS, where oemof's processing.results is called. This is the lion's share, making up more than half of the runtime for longer computations. I poked around in oemof's source code and identified three main sections in the function. First, there is some initialization: creating a pandas DataFrame from the given model and creating a dictionary of DataFrames from that. Both commands take about 10 microseconds each. In the next section, the dictionary is traversed; in total, this takes 50 microseconds (for my simple model). The last part adds dual variables for bus constraints, which do not seem to be set for my model, so no time was spent there. So the big part is iterating over the dictionary, and the main time sinks are pivoting the DataFrame dictionary and setting sequences. I am puzzled by both things for different reasons.
Contains processing.parameter_as_dict, processing.create_dataframe and a loop over the components that calls their result-handling functions. Rising strictly linearly, this part becomes the second largest contributor to the runtime at about 30 n_intervals. It consists of three parts: processing.parameter_as_dict (50% of its runtime), processing.create_dataframe (25%) and the loop over components (25%).
(In run_smooth, there is a last part where generate_results is called for each component, but virtually no time is spent there.)
To drive the point home that we should focus on processing.results and how we handle these results, here is a comparison of the timings for n_intervals = 100:
While we might not be able to change the functions themselves, maybe we can improve how we use them. For me, calling processing.results took about 80 to 90 milliseconds; for n_intervals = 100, this added up to 8 of the 14 seconds total. Would it be possible to call this function only once, by saving the intermediate results? During the runs only the values change, not the components, so there may be some optimization potential right there. Similarly, we could probably reuse the DataFrame from this function instead of calling processing.create_dataframe a second time during sim_handle_res.
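The reuse idea could look roughly like the sketch below. The cache class and the `build` callback are hypothetical; I have not checked whether oemof's result objects can safely be carried from processing.results into sim_handle_res like this.

```python
class ResultCache:
    """Build an expensive result object once per interval, then reuse it
    instead of rebuilding it via a second processing.create_dataframe call."""

    def __init__(self):
        self._df = None

    def get_dataframe(self, build):
        if self._df is None:
            self._df = build()   # only pay the cost on first access
        return self._df

build_calls = []

def expensive_build():
    build_calls.append(1)           # track how often we really build
    return {"flow": [1.0, 2.0]}     # stand-in for the results DataFrame

cache = ResultCache()
df1 = cache.get_dataframe(expensive_build)  # builds
df2 = cache.get_dataframe(expensive_build)  # reuses the same object
print(len(build_calls))  # prints 1
```

The cache would have to be reset at the start of each interval, since the flow values do change between intervals even though the components do not.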
Update: on closer look, the values from the two processing functions under sim_handle_res (parameter_as_dict and create_dataframe) are only used for debugging, when the result is not optimal. So this is unnecessary work when show_debug_flag is set to False. My recommendation: at least guard the two lines where results_dict and df_results are set behind a show_debug_flag check as well. Maybe even consider whether this debug info is important enough to warrant a serious slowdown (about 10% of the execution time).
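The guard I am suggesting, sketched with stand-in functions (the flag name is from run_smooth; the stubs only count calls so the effect is visible, they are not oemof's real functions):

```python
# Stand-ins for oemof's processing.parameter_as_dict / create_dataframe.
calls = {"parameter_as_dict": 0, "create_dataframe": 0}

def parameter_as_dict(model):
    calls["parameter_as_dict"] += 1
    return {}

def create_dataframe(model):
    calls["create_dataframe"] += 1
    return {}

def handle_results(model, show_debug_flag):
    # Only compute the debug-only values when debugging is requested.
    if show_debug_flag:
        results_dict = parameter_as_dict(model)
        df_results = create_dataframe(model)

handle_results(model=None, show_debug_flag=False)
print(calls)  # prints {'parameter_as_dict': 0, 'create_dataframe': 0}
```

With the flag off, neither function runs, which should recover the roughly 10% of execution time mentioned above.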