Closed: aulemahal closed this pull request 4 years ago.
@tlogan2000 With firstruncheck, I am currently computing the growing season length of a full generic scenario; memory is stable at 10-12 GB (9-10% of doris) and I estimate a computation time of 25-30 min.
Great. 25 minutes seems a bit long? What is the calc time for a 'normal' indicator? In any case, at least we have a version that is memory stable.
I would like to know how you obtained the memory-consumption graphs. Did you scrape the output of top, or is there something that does this? I have looked for something like that before, but without success so far...
@sbiner I use memory_profiler (https://pypi.org/project/memory-profiler/). Pretty cool! I launch my script for each "experiment" with:

mprof run -C bench_gsl.py exp

The -C is there to track all threads and child processes ("children"), which is necessary with dask. It can also produce figures directly, but I preferred to write my own plotting code in the script.
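For anyone who wants to do the same from inside a script rather than from mprof's output files, here is a minimal sketch using memory_profiler's Python API with matplotlib. It only illustrates the approach described above and is not the actual code in bench_gsl.py; `run_experiment` is a hypothetical stand-in for the benchmark.

```python
from memory_profiler import memory_usage
import matplotlib.pyplot as plt

def run_experiment():
    # Hypothetical placeholder for the actual benchmark, e.g. computing
    # growing season length over the full scenario.
    pass

# Sample total memory every 0.5 s while the experiment runs;
# include_children=True plays the role of mprof's -C flag.
mem = memory_usage((run_experiment, (), {}), interval=0.5, include_children=True)

plt.plot([0.5 * i for i in range(len(mem))], mem)
plt.xlabel("time (s)")
plt.ylabel("memory (MiB)")
plt.savefig("memory_profile.png")
```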
Thanks!
@tlogan2000 According to my Portraits Climatiques update, 25 minutes seems normal for a two-variable indicator (tas is computed from tasmin and tasmax).
The current computation of "growing season length" in xclim uses enormous amounts of memory and usually fails on large datasets. I tested another method to compute it; the results are good, though less impressive than the last two benchmarks done this way.
Two methods: the current xclim implementation, and one based on xc.run_length.first_run calls. For the second case, I tested many different versions to try to pinpoint what was responsible for the memory consumption. The best one is exp_firstruncheck.
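For context, the primitive behind the run_length approach is "find where the first run of N consecutive True values starts along time". Below is a minimal pure-xarray sketch of that idea; it is illustrative only, and it is neither this PR's code nor the actual xc.run_length.first_run, which also handles dask arrays.

```python
import numpy as np
import pandas as pd
import xarray as xr

def first_run_index(mask: xr.DataArray, window: int, dim: str = "time") -> xr.DataArray:
    """Index where the first run of `window` consecutive True values starts;
    NaN where no such run exists. Illustrative sketch only."""
    # A rolling sum reaching `window` marks positions where a full run ends.
    ends = mask.rolling({dim: window}).sum() >= window
    # argmax returns the first True along `dim`; shift back to the run's start.
    idx = ends.argmax(dim=dim) - (window - 1)
    # Points with no run at all would get argmax == 0, so mask them out.
    return idx.where(ends.any(dim=dim))

# Example with synthetic data: first day with 6 consecutive days above 5 degC.
time = pd.date_range("2000-01-01", periods=365)
tas = xr.DataArray(270 + 15 * np.sin(2 * np.pi * (np.arange(365) - 90) / 365),
                   dims="time", coords={"time": time})
start = first_run_index(tas > 278.15, window=6)
```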
Graphs:
1) Small chunks (50x50) and many years (99).
2) Large chunks (200x200) and fewer years (50).
The conclusion is that the default version can, with small tweaks, be sped up and made to use less memory. However, the method with first_run, while slower, consumes a lot less memory and does so more stably. I still have to test with data whose chunks are smaller than a year. More to come.
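"Chunks smaller than a year" refers to dask chunks along the time dimension that cover less than one year, so an annual indicator has to gather data from several chunks. A hypothetical example (the file name and chunk sizes are illustrative):

```python
import xarray as xr

# 200x200 spatial chunks, but only 120 days per chunk along time:
# each year then spans several chunks, which stresses the rechunking done
# by annually resampled indicators such as growing season length.
ds = xr.open_dataset("tas_scenario.nc",
                     chunks={"time": 120, "lat": 200, "lon": 200})
```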