CSHS-CWRA / CSHShydRology

Main package
GNU Affero General Public License v3.0
35 stars 39 forks source link

Example run times #56

Closed KevinShook closed 2 years ago

KevinShook commented 2 years ago

Several examples are being flagged in the check --as-cran as taking excessively long to execute

Examples with CPU (user + system) or elapsed time > 5s
                      user system elapsed
ch_flow_raster_trend 9.152  0.428   9.584
ch_regime_plot       8.354  0.375   8.730
ch_polar_plot        8.150  0.371   8.528

We should probably try using them with smaller data sets.

Session Info ``` ```
PaulWhitfield commented 2 years ago

While this is not unexpected, smaller datasets are not the best solution in all cases. Are these run times for the complete examples or just the single function?

For ch_regime_plot we could subset the data using ch_subset_Years in the example. I am surprised that this function takes that much time.

For_ch_flow_raster_trend a shorter dataset will create more issues as it needs sufficient years and the issue is the large number of calculations, here it would be better to use #notrun

For ch_polar_plot, changing the length of the input data will not affect the run time as the input data is the same length in all cases. It might be that the time is actually in ch_binned_MannWhitney as that is where all the calculations are done. If it is actually in ch_polar_plot then I suspect we are 'paying' for the runtime in plotrix::radial.plot I suggest we use #notrun

Paul

On Tue, 26 Oct 2021 at 21:14, KevinShook @.***> wrote:

Several examples are being flagged in the check --as-cran as taking excessively long to execute

Examples with CPU (user + system) or elapsed time > 5s user system elapsed ch_flow_raster_trend 9.152 0.428 9.584 ch_regime_plot 8.354 0.375 8.730 ch_polar_plot 8.150 0.371 8.528

We should probably try using them with smaller data sets. Session Info

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/CSHS-CWRA/CSHShydRology/issues/56, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGFB73C3COALTF7QWZP5IWDUI5VBTANCNFSM5GZIKPDQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

KevinShook commented 2 years ago

The run times appear to be for the all of the code in the examples for a given function. I did the runs on an i7 running Linux - maybe you could try to run it on your system to see if you get similar values. I will look at profiling the example code to see what is slowing things down

KevinShook commented 2 years ago

I just ran the profiler on ch_regime_plot. In this case the culprit is ch_doys. This function alone accounts for half of the run time of ch_regime_plot.

Looking at ch_doys I see that it uses a couple of loops. I don't think that these are necessary. Do you want me to rewrite the function to remove them, or would you like to do it?

KevinShook commented 2 years ago

I just checked and ch_doys is also the culprit in ch_flow_raster_trend

PaulWhitfield commented 2 years ago

If you can remove the loops and do it some other way, great. As I always say ... my code always looks like Rortran. Getting rid of the loops is a good idea.

On Wed, 27 Oct 2021 at 10:39, KevinShook @.***> wrote:

I just ran the profiler on ch_regime_plot. In this case the culprit is ch_doys. This function alone accounts for half of the run time of ch_regime_plot.

Looking at ch_doys I see that it uses a couple of loops. I don't think that these are necessary. Do you want me to rewrite the function to remove them, or would you like to do it?

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/CSHS-CWRA/CSHShydRology/issues/56#issuecomment-953107231, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGFB73G55SQ6G53CG522TADUJATL3ANCNFSM5GZIKPDQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

PaulWhitfield commented 2 years ago

Yes, as I recollect ch_doys is always slow.

On Wed, 27 Oct 2021 at 10:43, KevinShook @.***> wrote:

I just checked and ch_doys is also the culprit in ch_flow_raster_trend

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/CSHS-CWRA/CSHShydRology/issues/56#issuecomment-953110581, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGFB73ERAG2MWU76UNVWZV3UJAT4PANCNFSM5GZIKPDQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

KevinShook commented 2 years ago

Will do. I think I may have a function somewhere that does it already.

PaulWhitfield commented 2 years ago

much better performance with the changes

KevinShook commented 2 years ago

Thanks. BTW, the profiling package profvis is really useful. It creates an interactive graph showing where the greatest bottlenecks are in the code. I think I'll be using it in the future.

PaulWhitfield commented 2 years ago

What are the new run times with the changes?

KevinShook commented 2 years ago

The check command doesn't flag them as they are < 5 seconds. However, if I try them manually I get:


system.time(ch_regime_plot(CAN05AA008, colour = TRUE, wyear = 1))
   user  system elapsed 
  0.454   0.000   0.454 ```

```system.time(mplot <- ch_flow_raster_trend(CAN05AA008, step=5))
[1] "Bins = 73  The number of extra points in last bin is up to  1  per year"
   user  system elapsed 
  0.457   0.000   0.456 ```
PaulWhitfield commented 2 years ago

Yes, this is much better.