pydata / xarray

Comprehensive benchmarking suite #4648

Open · dcherian opened this issue 3 years ago

dcherian commented 3 years ago

I think a good "infrastructure" target for the NASA OSS call would be to expand our benchmarking suite (https://pandas.pydata.org/speed/xarray/#/).

AFAIK running these in a useful manner on CI is still an unsolved problem (please correct me if I'm wrong), but we can always run them on an NCAR machine using a cron job.

Thoughts?

cc @scottyhq

A quick survey of work needed (please append):

Related: #3514
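
For reference, our existing benchmarks follow asv (airspeed velocity) conventions, so additions would look roughly like the sketch below; the dataset shape and the groupby operation are just illustrative placeholders, not specific benchmarks anyone has proposed here.

```python
# Minimal asv-style benchmark sketch. The array sizes and the groupby
# operation are illustrative placeholders only.
import numpy as np
import xarray as xr


class GroupbySketch:
    def setup(self):
        # asv calls ``setup`` before timing; build a small synthetic dataset.
        self.ds = xr.Dataset(
            {"var": (("time", "x"), np.random.randn(1000, 100))},
            coords={"time": np.arange(1000) % 12},
        )

    def time_groupby_mean(self):
        # Methods prefixed with ``time_`` are timed by asv.
        self.ds.groupby("time").mean()

    def peakmem_groupby_mean(self):
        # Methods prefixed with ``peakmem_`` track peak memory usage.
        self.ds.groupby("time").mean()
```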

scottyhq commented 3 years ago

Thanks for the ping @dcherian, I really like the idea! One other thing that often gets neglected in test suites is operating on remote data. I understand the need to avoid long-running tests and tests prone to network failures for PRs, but running these sorts of examples as a cron job could be very helpful for benchmarking and detecting issues.

In intake-xarray we recently added tests against a local HTTP server and "S3" server: https://github.com/intake/intake-xarray/blob/master/intake_xarray/tests/test_remote.py

We also added several simple tests that require a network connection to public data (no auth required); we run these locally but not in CI currently: https://github.com/intake/intake-xarray/blob/master/intake_xarray/tests/test_network.py
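
To make the idea concrete, a network-only test for xarray could look something like the sketch below; the URL is a placeholder, the `network` pytest marker is hypothetical, and it assumes the remote file is NetCDF4/HDF5 so the h5netcdf engine can read it from a file-like object.

```python
# Hypothetical sketch of a network-dependent test; the URL and the
# ``network`` marker are placeholders, not part of any existing suite.
import fsspec
import pytest
import xarray as xr


@pytest.mark.network
def test_open_public_netcdf_over_http():
    # Stream a small public NetCDF4 file over HTTP (no auth) and check
    # that it opens with the expected dimension.
    url = "https://example.com/data/small.nc"  # placeholder URL
    with fsspec.open(url) as f:
        ds = xr.open_dataset(f, engine="h5netcdf")
        assert "time" in ds.dims
```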

dcherian commented 3 years ago

Thanks @scottyhq

> One other thing that often gets neglected in test suites is operating on remote data.

This lines up with the "Pangeo integration tests" idea that came up in a Pangeo meeting (cc @rabernat).

Regardless of whether it fits there, I think adding benchmarks and tests for the xarray+zarr+fsspec (or xarray+mfdataset+netCDF) stack is an important and unmet need of the Pangeo community in general that we could address.
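
To sketch what that might look like without hitting the network, here is a rough asv-style benchmark against an in-memory fsspec store; the store path, array shape, and the operations being timed are assumptions for illustration, and a realistic version would presumably use dask-chunked data on a remote object store.

```python
# Rough sketch of an xarray+zarr+fsspec benchmark using an in-memory
# fsspec store; shapes, store path, and timed operations are illustrative.
import fsspec
import numpy as np
import xarray as xr


class ZarrFsspecSketch:
    def setup(self):
        # Write a small synthetic dataset to an in-memory fsspec mapper.
        self.store = fsspec.get_mapper("memory://bench.zarr")
        ds = xr.Dataset({"var": (("time", "x"), np.random.randn(365, 1000))})
        ds.to_zarr(self.store, mode="w")

    def time_open_zarr(self):
        # Time opening the store (metadata and lazy arrays only).
        xr.open_zarr(self.store, consolidated=False)

    def time_load_zarr(self):
        # Time reading the variable's data back into memory.
        xr.open_zarr(self.store, consolidated=False)["var"].load()
```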

max-sixty commented 3 years ago

This would be great.

Getting down a couple of levels: I think we could potentially run this as a cron job on GitHub Actions. NCAR would also be a good option. I'm also happy to supply a VM if that's helpful.

dcherian commented 3 years ago

It looks like Quansight thinks GitHub Actions is a good place to run scikit-image's benchmarks: https://labs.quansight.org/blog/2021/08/github-actions-benchmarks/, so maybe we can set that up for our existing benchmarks.

Here's the workflow: https://github.com/jaimergp/scikit-image/blob/main/.github/workflows/benchmarks-cron.yml

dcherian commented 2 years ago

@TomAugspurger are you still in charge of the pydata benchmarking machine? If so, could you please add xarray to the list (https://pandas.pydata.org/speed/)? @Illviljan has made major improvements, so the benchmarks should run a lot faster now.

TomAugspurger commented 2 years ago

"In charge of" is overstating it a bit. It's been segfaulting when building pandas and I haven't had a chance to debug it.

If/when I get around to fixing it, I'll try adding xarray, but it might be a while.