cubed-dev / cubed-benchmarks

Automated benchmark suite for testing the performance of cubed
Apache License 2.0

Run on Lithops AWS #7

Closed tomwhite closed 6 months ago

tomwhite commented 6 months ago

I had to install the following packages (Xarray is pinned due to namedarray changes):

pip install pytest-xdist filelock 'xarray==2023.10.0' cubed_xarray lithops rich pydot s3fs

Then I could run some tests on Lithops:

pytest --benchmark -s 'tests/benchmarks/test_array.py::test_quadratic_means_xarray[configs/lithops_aws.yaml-50]'
pytest --benchmark -s 'tests/benchmarks/test_array.py::test_quadratic_means_xarray[configs/lithops_aws.yaml-500]'
pytest --benchmark -s 'tests/benchmarks/test_array.py::test_quadratic_means_xarray[configs/lithops_aws.yaml-5000]'

Note that pytest's -s option enables Cubed's Rich progress bars, which are helpful when running a computation interactively.

$ duckdb benchmark.db
D select name, duration from test_run where name like '%lithops%';
┌────────────────────────────────────────────────────────────┬────────────────────┐
│                            name                            │      duration      │
│                          varchar                           │       double       │
├────────────────────────────────────────────────────────────┼────────────────────┤
│ test_quadratic_means_xarray[configs/lithops_aws.yaml-50]   │  34.91558289527893 │
│ test_quadratic_means_xarray[configs/lithops_aws.yaml-500]  │ 56.183167934417725 │
│ test_quadratic_means_xarray[configs/lithops_aws.yaml-5000] │  66.49491095542908 │
└────────────────────────────────────────────────────────────┴────────────────────┘
TomNicholas commented 6 months ago

Wait 66 seconds for t_length=5000?! That's the same as the 150GB job in the blog post notebook, which previously took 11 minutes!

tomwhite commented 6 months ago

> Wait 66 seconds for t_length=5000?! That's the same as the 150GB job in the blog post notebook, which previously took 11 minutes!

That's right. This is running on AWS though - I have noticed that GCP can be several times slower (I don't know why), but it is still a lot faster than when you first ran the benchmarks.
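
For a rough sense of the numbers discussed in this thread, the arithmetic can be sketched as below. The durations are copied from the DuckDB output above; treating "11 minutes" as exactly 660 seconds is an assumption, and the variable names are purely illustrative.

```python
# Durations in seconds, keyed by t_length, copied from the DuckDB query output above.
durations = {
    50: 34.91558289527893,
    500: 56.183167934417725,
    5000: 66.49491095542908,
}

# Assumption: the earlier "11 minutes" run is treated as exactly 660 seconds.
previous_seconds = 11 * 60

# How much faster the t_length=5000 run is than the earlier figure.
speedup = previous_seconds / durations[5000]

# How much the duration grows when t_length grows 100x (50 -> 5000).
scaling = durations[5000] / durations[50]

print(f"speedup vs. earlier run: {speedup:.1f}x")
print(f"duration ratio for 100x t_length: {scaling:.2f}x")
```

This is only a back-of-the-envelope comparison: the earlier run may have been on a different platform or setup, so the two figures are not necessarily like-for-like.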

tomwhite commented 6 months ago

I can go and look into this

Thanks!