bird-house / flyingpigeon

WPS processes for climate model data, indices and extreme events
http://flyingpigeon.readthedocs.io/en/latest/
Apache License 2.0

benchmark for CASTf90 analogs process #48

Closed. nilshempelmann closed this issue 8 years ago.

sradanov commented 8 years ago

Computer: 4 CPUs, Fedora.

Case 1: 2-year simulation with a 40-year archive, counting only the castf90 execution part, not data preparation:

- castf90 home edition called from terminal: 10 s (22 s with 1 CPU)
- castf90 conda edition called from terminal: 12 s
- castf90 home edition called from the Python command line with os.system: 12 s
- castf90 home edition called from the Python command line with subprocess: 15 s
- castf90 called in the analogs process with birdy: ~47 s
- castf90 called in the analogs process via the Phoenix GUI: 47 s

In all cases (except the 1-CPU case) the system monitor shows 100% activity on all 4 CPUs during the indicated time.

Case 2: 10-year simulation with a 40-year archive:

- castf90 home edition called from terminal: 16 s (42 s with 1 CPU)
- castf90 conda edition called from terminal: 14 s
- castf90 home edition called from the Python command line with os.system: 15 s
- castf90 home edition called from the Python command line with subprocess: 14 s
- castf90 called in the analogs process with birdy: ~3:10 min at minimum, probably longer
- castf90 called in the analogs process via the Phoenix GUI: 3:41 min

sradanov commented 8 years ago

Case 1, with the unused processes commented out in `flyingpigeon/flyingpigeon/processes/__init__.py`: castf90 called in the analogs process via the Phoenix GUI: 49 s. So this change does not have much influence.

cehbrecht commented 8 years ago

You can use OWSLib to call the process directly (without the Phoenix GUI or the birdy command line). There is an example as an IPython notebook:

https://github.com/bird-house/flyingpigeon/blob/master/notebooks/WPS_analogs.ipynb

You can call the process asynchronously (the default) or synchronously. In the notebook example, the async call took 27 seconds and the sync call 18 seconds.
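For reference, a minimal OWSLib sketch of such a direct call (the service URL is an assumption and has to be adjusted to your deployment; the keyword that selects synchronous vs asynchronous execution differs between OWSLib versions, see the comment in the code):

```python
from owslib.wps import WebProcessingService, monitorExecution

# Assumed local endpoint of the flyingpigeon WPS; adjust to your deployment.
wps = WebProcessingService("http://localhost:8093/wps")

# Execute the analogs process with the case 1 inputs from this thread.
# Note: the keyword controlling sync/async execution is version dependent
# (async=... in the setup discussed here, mode=... in newer OWSLib releases);
# this call simply uses the default (async) mode.
execution = wps.execute(
    identifier="analogs",
    inputs=[("dateSt", "2013-01-01"), ("refEn", "1995-12-31")],
)

# For an asynchronous run, poll the status document until the job finishes.
monitorExecution(execution, sleepSecs=5)
print(execution.status)
```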

I have added some measuring in the analogs process:

https://github.com/bird-house/flyingpigeon/blob/master/flyingpigeon/processes/wps_analogs.py
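The measuring is essentially wall-clock timestamps around each step, written to the debug log. A minimal sketch of that pattern (illustrative only, not the actual wps_analogs.py code):

```python
import logging
import time

logger = logging.getLogger(__name__)

def log_step(label, start):
    """Log how long a step took and return a fresh start time (sketch only)."""
    now = time.time()
    logger.debug('%s took %s seconds.', label, now - start)
    return now

# Illustrative use inside the process, mirroring the log lines shown below:
start = time.time()
# ... fetch the input data here ...
start = log_step('get_input_data', start)
# ... prepare the data, write the castf90 config, run castf90 ...
start = log_step('castf90', start)
```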

In the log you can see the duration for each step. Make sure you have debug log mode enabled.

See the log with the performance:

$ cd ~/.conda/envs/birdhouse/
$ tail -f var/log/pywps/flyingpigeon.log | grep took

In my case it looks like this:

DEBUG: init took 0.00246214866638 seconds.
DEBUG: get_input_data took 14.1966841221 seconds.
DEBUG: data preperation took 0.612922906876 seconds.
DEBUG: write_config took 0.000335931777954 seconds.
DEBUG: castf90 took 2.40370893478 seconds.
DEBUG: total execution took 17.2252738476 seconds.

So ... the time goes into get_input_data, i.e. the input data preparation.

sradanov commented 8 years ago

Well, castf90 already writes the time when it starts (after data preparation) and the time when it ends into the log file, so the times I indicated are WITHOUT data preparation, which is a separate issue. Thanks for the notebook!

cehbrecht commented 8 years ago

> Case 2: 10-year simulation with a 40-year archive: castf90 conda edition called from terminal: 14 s

So 14 s is the time we should expect here. If you only measure the execution time of castf90 in the WPS process ... it should be the same. I don't understand how it can take minutes ...

Do we need to set some environment variables?

One could check only this code part in the process ...

If you run the wps service then use owslib with sync mode ... (birdy and phoenix only use async mode).

Can I reproduce this use case on my laptop? Do you have the input parameters?

sradanov commented 8 years ago

Execution time: well, yes, I agree it should be, but it isn't. That's the issue.

Environment variables: since castf90 already occupies all 4 of my CPUs, it was not necessary for me to set any environment variables. If it doesn't, one can set OMP_NUM_THREADS=4 (or whatever number of threads should be used), as sketched below.
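A minimal sketch of setting the thread count before launching castf90 from Python (the binary and config file names are placeholders, adjust them to your installation):

```python
import os
import subprocess

# Ask OpenMP to use all four cores; binary and config names are placeholders.
env = dict(os.environ, OMP_NUM_THREADS="4")
subprocess.check_call(["./castf90", "config.txt"], env=env)
```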

I tried sync mode using `execute = wps.execute(identifier="analogs", inputs=[("dateSt", '2013-01-01'), ("refEn", '1995-12-31')], async=False)` (is that how it works?), which did not change anything substantial: 42 s with async=True, 42 s with async=False.

Laptop: of course you can. For case 1, use `inputs=[("dateSt", '2013-01-01'), ("refEn", '1995-12-31')]` with the defaults for the other parameters; for case 2, use `inputs=[("dateSt", '2005-01-01'), ("refEn", '1995-12-31')]`.

What exactly do you mean by checking only the code part in the process?

sradanov commented 8 years ago

PS: the numbers above are for case 1.

cehbrecht commented 8 years ago

I have run use case 1 and the WPS call took 45 s for the castf90 part.

Then I took the produced config file and the same preprocessed input data and ran the conda script on the command line ... and it took 42 s.

So ... is there something wrong with the config file or the preprocessed input data?

@nilshempelmann the archive and simulation input files are written to the etc/pywps folder ... I could not see why ... please open a ticket for it.

cehbrecht commented 8 years ago

I've updated the notebook with both use cases:

https://github.com/bird-house/flyingpigeon/blob/master/notebooks/WPS_analogs.ipynb

Async mode is better in this long-running case ;)

sradanov commented 8 years ago

OK, apparently I messed up something with the direct conda package call. I now get 42 s for case 1 as well, which means everything is OK with the WPS, but the conda package is slow compared to the classic command-line castf90.

cehbrecht commented 8 years ago

So we need to update the castf90 conda package (#37) and find out how to improve its performance.

As a workaround, one can use the manual installation of castf90 ...

nilshempelmann commented 8 years ago

CASTf90 itself seems to be very fast. The performance bottleneck is now the subsetting during data preparation (#80).