pangeo-data / storage-benchmarks

testing performance of different storage layers
Apache License 2.0

Update: April 15, 2018 #27

Open kaipak opened 6 years ago

kaipak commented 6 years ago

Getting the KubeCluster/Dask/GCP tests to work properly took a little more elbow grease than I expected, but I'm finally getting consistent results. The previously published tests from my forked repo have been updated with this more comprehensive set. Bear in mind that these tests can take a long time to run, since each one repeats many times in order to be statistically meaningful. Many of them are truncated at the moment and may display odd results. Longer tests are currently running on GCP and on my laptop, and I'll upload the results when they finish.

https://kaipak.github.io/storage-benchmarks/#/

Here's what we've written so far:

Because of problems getting consistent runs in ASV with the plethora of pieces we're dealing with, I didn't get around to documentation or prettying up the plots as I had planned, but I will be focusing on that for the next couple of days. The ASV docs are also quite sparse, so I think it'll be worthwhile to have something more comprehensive here, especially since the behavior of some of its settings is not necessarily obvious.
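For anyone unfamiliar with ASV, here is a minimal sketch of what a benchmark class looks like and where the timing settings live. It is not one of the tests in this repo: the class name, the parameter values, and the local-disk write are all made up for illustration. The attributes `params`, `param_names`, `timeout`, `repeat`, and `number` are the ASV settings that tend to matter most for long-running I/O tests.

```python
import os
import shutil
import tempfile


class SyntheticWriteSuite:
    """Hypothetical ASV-style benchmark: time writing synthetic bytes to local disk."""

    # ASV reads these class attributes when it collects the benchmark.
    params = [1, 8]             # payload size in MiB (illustrative values only)
    param_names = ["size_mib"]
    timeout = 600               # seconds before ASV abandons a run
    repeat = 5                  # number of timing samples to collect
    number = 1                  # calls per sample; setup is not rerun between calls

    def setup(self, size_mib):
        # Runs outside the timed region: create scratch space and the payload.
        self.tmpdir = tempfile.mkdtemp()
        self.payload = os.urandom(size_mib * 1024 * 1024)

    def teardown(self, size_mib):
        # Runs outside the timed region: remove the scratch directory.
        shutil.rmtree(self.tmpdir, ignore_errors=True)

    def time_write(self, size_mib):
        # Only methods prefixed with time_ are timed by ASV.
        path = os.path.join(self.tmpdir, "blob.bin")
        with open(path, "wb") as f:
            f.write(self.payload)
            f.flush()
            os.fsync(f.fileno())
```

Setting `number = 1` is a deliberate choice in this sketch: with a larger value, ASV calls the timed method several times per sample without rerunning `setup`, which can let caching skew storage measurements.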

I'd like to detail more fully the tests we have so far and what we plan to work on next. There is no set schedule per se, but I've been roughly favoring getting results out of Dask/Xarray/GCP. In the immediate future, I plan on writing tests that use real data (likely LLC4320 ocean general circulation simulation output). Since all these tests now work for synthetic data, it should be relatively straightforward to point them at actual datasets. Here's my rough idea of a schedule for the next round of test writing.

If there's a particular use case someone is dying to see, I'd be happy to take requests.

rabernat commented 6 years ago

Kai, this is great progress!

As discussed in our meeting today, here are some next priorities:

rabernat commented 6 years ago

And be careful about load vs persist!
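To make the distinction concrete, here is a small sketch using synthetic data and the default local scheduler (the array shape, chunking, and variable names are arbitrary choices for illustration, not taken from the benchmark suite). In xarray, `persist()` runs the task graph but keeps the result as chunked dask arrays, while `load()` gathers everything into a single NumPy array in the calling process.

```python
import dask.array as da
import numpy as np
import xarray as xr

# Synthetic, lazily evaluated data; nothing is computed yet.
lazy = xr.DataArray(
    da.random.random((400, 400), chunks=(100, 100)),
    dims=("y", "x"),
    name="synthetic",
)

# persist(): computes the chunks but the result stays dask-backed and chunked.
persisted = lazy.persist()

# load(): computes and gathers the data into one in-memory NumPy array.
# Note that load() modifies `lazy` in place and returns it.
loaded = lazy.load()

persisted_is_dask = isinstance(persisted.data, da.Array)
loaded_is_numpy = isinstance(loaded.data, np.ndarray)
print(persisted_is_dask, loaded_is_numpy)
```

For timing purposes the two measure different things: `load()` includes the cost of moving data to the client, whereas with a distributed scheduler `persist()` returns right away while the work continues on the cluster, so a benchmark would likely need something like `distributed.wait` on the persisted object to capture the full cost.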