Stevogallo opened this issue 6 months ago
Hello @Stevogallo, thanks for opening this issue. The speed of the parallel processing can depend on the local configuration. Would you be able to share your use case so that I can run tests on my computer? Your laptop's CPU and RAM would also be of interest, if you are OK with sharing that information.
Hi @Bachibouzouk, I'm using the default "Input File 1". I'm working on a Jupyter notebook we created, https://github.com/SESAM-Polimi/RAMP-Jupyter, because it makes teaching in class much easier.
My laptop:
- Processor: 13th Gen Intel(R) Core(TM) i7-1355U, 1700 MHz, 10 cores, 12 logical processors
- Installed physical memory (RAM): 16.0 GB
- OS: Windows 11 Pro
Let me know if you need something else.
Hi @Stevogallo - thanks for the information. Do you use one of the two notebooks under ramp/Jupyter Notebooks in the repository you provided?
After a quick search, it seems other people also have issues with the combination of multiprocessing, Jupyter and Windows. Could you try running the same code from a plain Python script, to check whether the problem comes from Jupyter?
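Something along these lines should do. I am writing it from memory, so treat the RAMP import paths and the UseCase usage below as assumptions and adjust them to your installed version; only the two generate_* method names are taken from this thread:

```python
# test_ramp_timing.py - run with "python test_ramp_timing.py" from a terminal, outside Jupyter.
import time
from multiprocessing import freeze_support

from ramp.core.core import UseCase                # assumed location of the UseCase class
from ramp.example.input_file_1 import User_list   # assumed module for "Input File 1"


def timed(label, func, *args, **kwargs):
    """Call func once and print the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    print(f"Time taken by {label}: {time.perf_counter() - start:.2f} seconds")
    return result


if __name__ == "__main__":
    # The __main__ guard (and freeze_support) is required on Windows, because
    # multiprocessing starts fresh interpreters that re-import this module.
    freeze_support()

    use_case = UseCase(users=User_list)
    # use_case.initialize(num_days=365)  # may be needed depending on your RAMP version

    timed("generate_daily_load_profiles", use_case.generate_daily_load_profiles)
    timed(
        "generate_daily_load_profiles_parallel",
        use_case.generate_daily_load_profiles_parallel,
    )
```

If the parallel call is also much slower in the plain script, then Jupyter is not the culprit.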
Sorry @Bachibouzouk, I didn't specify. I'm using "\RAMP-Jupyter-main\RAMP-Jupyter-main\ramp\Jupyter Notebooks\RAMP Example Village - Excel.ipynb".
I'll try without Jupyter and let you know.
@Bachibouzouk, an update: I ran Input_file_1 from Visual Studio, here are the results:
Time taken by generate_daily_load_profiles_parallel: 950.19 seconds
Time taken by generate_daily_load_profiles: 105.32 seconds
@Stevogallo - I also tested locally, and the parallel processing takes more time. As the multiprocessing version is not pinned, it could be due to a newer version of multiprocessing. In the profiling, the acquire method of the _thread.lock object takes up 50% of the time. I unfortunately don't have the resources to investigate this at the moment :(
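For reference, this kind of profiling can be reproduced with something like the following (the profiled call is only a placeholder standing in for the actual RAMP parallel call):

```python
# profile_parallel.py - profile the parallel run and inspect the top cumulative entries.
import cProfile
import pstats


def run_parallel():
    # Placeholder: put the actual call here, e.g. use_case.generate_daily_load_profiles_parallel()
    pass


if __name__ == "__main__":
    cProfile.run("run_parallel()", "parallel.prof")
    stats = pstats.Stats("parallel.prof")
    # Entries like "method 'acquire' of '_thread.lock' objects" show up
    # near the top when sorting by cumulative time.
    stats.sort_stats("cumulative").print_stats(20)
```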
I was still puzzled by this and looked at it more closely: it depends quite a lot on the chunksize parameter (how many tasks get assigned as a batch to each CPU). At the moment it is fixed to 4. I fiddled with it a bit and managed to run the code in 3 seconds with parallel processing versus 17 seconds without it. The problem is that the best value depends on the size of the task list, which in turn depends on how many days, appliances and users we have. I will try to find a compromise. In the meantime you can try to fiddle with the parameter yourself @Stevogallo.
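To see the effect outside of RAMP, here is a minimal standalone illustration (not RAMP code) of how much Pool.map throughput depends on chunksize when the individual tasks are cheap:

```python
# chunksize_demo.py - compare Pool.map wall-clock time for several chunksize values.
import time
from multiprocessing import Pool


def small_task(x):
    # Stand-in for one cheap unit of work (e.g. one simulated day for one user).
    return sum(i * i for i in range(1_000)) + x


if __name__ == "__main__":
    tasks = list(range(20_000))
    for chunksize in (1, 4, 64, 512):
        start = time.perf_counter()
        with Pool() as pool:
            pool.map(small_task, tasks, chunksize=chunksize)
        print(f"chunksize={chunksize:>3}: {time.perf_counter() - start:.2f} s")
```

As far as I know, when chunksize is left unset, Pool.map itself picks roughly len(iterable) / (4 * number_of_workers), which is why the "right" value scales with the size of the task list.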
I was testing the newly added option "generate_load_profiles_parallel", which according to your comments should be faster. However, I'm experiencing tenfold computation times, and double the RAM use, compared to when I use the old function, "generate_load_profiles".