MAGALA-RICHARD / apsimNGpy

ApsimNGpy is a Python library designed as an implementation of the APSIM Next Generation crop and environmental simulation model. This library is built upon the Pythonnet API, providing direct communication with the APSIM model. Notably, it offers a significant speed advantage, being twice as fast as the graphical user interface for simulations.
Apache License 2.0

How to reduce parallel runtime #10

Closed: zebra-fgd closed this issue 7 months ago

zebra-fgd commented 8 months ago

Dear Author, I have run into a problem lately. I wrote a simple multi-process test program based on apsimNGpy.parallel.examples.py, but the improvement over a single-process run is small.

import os
import time

os.environ['APSIM'] = r'D:\application\APSIM\bin'
from apsimNGpy.parallel.process import run_apsimxfiles_in_parallel, read_result_in_parallel
from apsimNGpy.utililies.utils import collect_runfiles
from apsimNGpy.utililies.utils import make_apsimx_clones
from apsimNGpy.utililies.run_utils import run_model

hd = r'D:\project\apsimNGpy\da\my\data\tmp\apsimx_cloned'
maize = r'D:\project\apsimNGpy\da\my\data\tmp\maize.apsimx'

if __name__ == "__main__":
    # For specific reasons, I need to run 6 files in parallel, and repeat the run multiple times.
    for i in range(10):
        ap = make_apsimx_clones(maize, 6)
        files = collect_runfiles(path2files=hd, pattern=["*.apsimx"])  # change path as needed
        t = set(run_apsimxfiles_in_parallel(ap, ncores=6, use_threads=False))
        ap = make_apsimx_clones(maize, 6)
        st = time.time()
        t_n = 0
        for f in ap:
            st1 = time.time()
            run_model(f)
            et1 = time.time()
            print(f'no.{i}-{t_n}', str(et1 - st1), 'seconds', 'to run 1 files')
            t_n += 1
        et = time.time()
        print(f'no.{i}', str(et-st), 'seconds', 'to run 6 files')
Running apsimx files: 100% completed
34.973912000015844 seconds to run 6 files
no.0-0 12.061067342758179 seconds to run 1 files
no.0-1 3.7569408416748047 seconds to run 1 files
no.0-2 3.6963162422180176 seconds to run 1 files
no.0-3 3.67946195602417 seconds to run 1 files
no.0-4 3.728914260864258 seconds to run 1 files
no.0-5 3.655278444290161 seconds to run 1 files
no.0 30.592000246047974 seconds to run 6 files
Running apsimx files: 100% completed
32.382027999992715 seconds to run 6 files
no.1-0 3.667325019836426 seconds to run 1 files
no.1-1 3.6620712280273438 seconds to run 1 files
no.1-2 3.76507306098938 seconds to run 1 files
no.1-3 3.678429126739502 seconds to run 1 files
no.1-4 3.672969102859497 seconds to run 1 files
no.1-5 3.6747539043426514 seconds to run 1 files
no.1 22.138978242874146 seconds to run 6 files
Running apsimx files: 100% completed
33.97747179999715 seconds to run 6 files
no.2-0 3.6591010093688965 seconds to run 1 files
no.2-1 3.704801321029663 seconds to run 1 files
no.2-2 3.787921190261841 seconds to run 1 files
no.2-3 3.6740224361419678 seconds to run 1 files
no.2-4 3.692131996154785 seconds to run 1 files
no.2-5 3.6714439392089844 seconds to run 1 files
no.2 22.209059476852417 seconds to run 6 files
Running apsimx files: 100% completed
33.19529509998392 seconds to run 6 files
no.3-0 3.6933140754699707 seconds to run 1 files
no.3-1 3.716364622116089 seconds to run 1 files
no.3-2 3.6551353931427 seconds to run 1 files
no.3-3 3.660146713256836 seconds to run 1 files
no.3-4 3.751201629638672 seconds to run 1 files
no.3-5 3.6740548610687256 seconds to run 1 files
no.3 22.170230865478516 seconds to run 6 files
......

The multi-process run takes about 30 s (32 to 35 s in the output above). The first single-process pass also takes about 30 s, and the later single-process passes take about 22 s. I guess that loading the .NET environment accounts for the extra 8 s or so in the first pass. So this does not look like it is actually running in parallel.
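The arithmetic behind that guess can be checked with a short sketch. The timings are the "no.0-*" values from the output above, rounded to three decimals. The interpretation is an assumption rather than anything confirmed in this thread: the slow first run is taken to be a one-time start-up cost, and each worker process is assumed to pay that cost once while all six files run concurrently.

```python
# Per-file timings in seconds, rounded from the "no.0-*" lines of the output above.
first_pass = [12.061, 3.757, 3.696, 3.679, 3.729, 3.655]

# Treat every run after the first as "warm" (runtime already loaded).
warm_runs = first_pass[1:]
warm_mean = sum(warm_runs) / len(warm_runs)

# Assumption: the extra time in the first run is a one-time start-up cost.
startup_overhead = first_pass[0] - warm_mean

# Assumption: with 6 workers, each worker pays the start-up cost once and
# the 6 files run fully concurrently, so the slowest warm run sets the pace.
ideal_parallel = startup_overhead + max(warm_runs)

print(f"mean warm run:          {warm_mean:.2f} s")
print(f"estimated start-up:     {startup_overhead:.2f} s")
print(f"rough parallel minimum: {ideal_parallel:.2f} s")
```

Under these assumptions the estimate lands close to the 10 s target, which suggests that a parallel time of 30 s or more reflects something other than the simulations themselves.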

Can the code be improved to reduce runtime? Ideally, can the runtime be reduced to about 10s?

MAGALA-RICHARD commented 8 months ago

The difference may not be noticeable with a small number of runs such as these. I would suggest reducing the number of cores, perhaps to half or to two, and observing any difference. Optimizing performance sometimes involves experimentation, particularly with tasks of this scale, which are minor in the grand scheme of things. It is also worth increasing the count to 200 files to see how the timings vary.
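One way to run that kind of experiment is a small timing harness built on Python's standard `concurrent.futures`. This is a minimal sketch, not apsimNGpy code: `fake_simulation` and `time_batch` are hypothetical names, and the CPU-bound loop only stands in for a real model run. The `use_threads` flag mirrors the parameter of the same name seen in the question's call, purely for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fake_simulation(job_id):
    """Hypothetical stand-in for one model run: a fixed amount of CPU work."""
    total = 0
    for k in range(200_000):
        total += k * k
    return job_id


def time_batch(task, jobs, workers, use_threads=False):
    """Run task over jobs with the given worker count.

    Returns (elapsed seconds, list of results in job order).
    """
    pool_cls = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    start = time.perf_counter()
    with pool_cls(max_workers=workers) as pool:
        results = list(pool.map(task, jobs))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # Compare a small batch (6 jobs) with a larger one (200 jobs)
    # across several worker counts.
    for n_jobs in (6, 200):
        for workers in (1, 2, 3, 6):
            elapsed, _ = time_batch(fake_simulation, range(n_jobs), workers)
            print(f"{n_jobs} jobs, {workers} workers: {elapsed:.2f} s")
```

With only a handful of short jobs, the cost of starting worker processes can outweigh the gain, which is why a larger batch tends to show the effect of parallelism more clearly.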

zebra-fgd commented 7 months ago

Thanks for the tip. I lowered the number of cores, but the parallel execution time did not decrease. However, I then ran the same code on another computer with the same performance as mine, and the parallel run time was much shorter, so I'm guessing there might be something wrong with my computer.