dask / community

For general discussion and community planning. Discussion issues welcome.

Integration test suite before release #163

Open mrocklin opened 3 years ago

mrocklin commented 3 years ago

We currently run a test suite on every commit. These tests are designed to be focused and fast.

However, when we release we may want to consider running some larger workflows that test holistic behavior. This might include, for example, reading a 10,000-partition Parquet dataset from S3: something that is important to cover, but not something that we want to put into our regular test suite. This might also be a good place to include workflows from downstream projects like RAPIDS and Xarray.
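For concreteness, a release-time check along these lines might look roughly like the sketch below (the bucket, path, and partition count are placeholders, not a real dataset):

```python
# Minimal sketch of a large-scale release check; the S3 location and the
# expected partition count are hypothetical placeholders.
import dask.dataframe as dd


def test_read_large_parquet_from_s3():
    df = dd.read_parquet(
        "s3://example-bucket/large-dataset/",  # placeholder location
        storage_options={"anon": True},
    )
    # The interesting part is that this completes at scale, not the values
    assert df.npartitions >= 10_000
    assert len(df) > 0  # forces a full pass over the data
```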

This would be something that the release manager would be in charge of kicking off.

Some things that we would need to do

cc @jacobtomlinson @quasiben @jrbourbeau

martindurant commented 3 years ago

cc @rabernat, who would like to see some similar workflows at large scales in the context of pangeo-forge.

rabernat commented 3 years ago

xref https://github.com/pangeo-data/pangeo-integration-tests/issues/1

brl0 commented 3 years ago

It might make sense to also consider other downstream projects like spatialpandas and dask-geopandas to help catch issues like https://github.com/holoviz/spatialpandas/issues/68 and https://github.com/geopandas/dask-geopandas/issues/49.

BTW, I really appreciate seeing the assistance those projects received to help address those issues. It's really cool to see that kind of community support. Big thanks to everybody contributing to and supporting this awesome ecosystem.

jsignell commented 3 years ago

It'd be interesting to think about what it means to pass a test suite like that. For instance, is performance a part of it? It would be very interesting to publish benchmarks with each release. It seems less common that a release actually breaks the "read 10,000 files of Parquet" case, and more common that it introduces a performance regression.
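One possibility (just a sketch): run an asv-style benchmark suite at release time and publish the results. asv discovers classes with `time_*` methods; the class, dataset path, and column name below are hypothetical:

```python
# Hypothetical asv-style benchmark; asv times each time_* method and lets you
# compare results across releases.
import dask.dataframe as dd


class ParquetReadSuite:
    # A real suite would pin a fixed, versioned dataset so results stay
    # comparable between releases; this path is a placeholder.
    path = "s3://example-benchmarks/parquet-10k/"

    def time_build_graph(self):
        # Cost of metadata discovery and graph construction only
        dd.read_parquet(self.path, storage_options={"anon": True})

    def time_read_single_column(self):
        # Small end-to-end read to catch IO/scheduling regressions
        dd.read_parquet(
            self.path, columns=["id"], storage_options={"anon": True}
        ).head()
```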

jakirkham commented 2 years ago

Would also suggest adding some Dask projects to this list like Dask-ML, Dask-Image, etc. At least with Dask-ML we have seen a couple breakages recently that probably could have been avoided with integration testing.

jakirkham commented 2 years ago

Might even just be worthwhile to do runs of these tests every 24hrs or so. This can help identify issues a bit sooner than a release, giving people more time to fix and update.

Numba did some work in this space that we might be able to borrow from: texasbbq

Also, having nightlies (https://github.com/dask/community/issues/76) would help smooth out the integration testing process and aid in local reproducibility.
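This isn't how texasbbq itself is configured, but a rough sketch of the shape of such a nightly job could be: install dask and distributed from main, then run downstream test suites and report what failed. The project list and test commands below are illustrative placeholders:

```python
# Rough sketch of a nightly downstream run; projects and test commands are
# placeholders, not a definitive list.
import subprocess
import sys

DOWNSTREAM = {
    "dask-ml": ["pytest", "--pyargs", "dask_ml"],
    "dask-image": ["pytest", "--pyargs", "dask_image"],
}


def run(cmd):
    print("+", " ".join(cmd), flush=True)
    return subprocess.run(cmd).returncode


def main():
    # Install dask and distributed from their main branches
    run([sys.executable, "-m", "pip", "install",
         "git+https://github.com/dask/dask",
         "git+https://github.com/dask/distributed"])
    failures = []
    for project, test_cmd in DOWNSTREAM.items():
        run([sys.executable, "-m", "pip", "install", project])
        if run(test_cmd) != 0:
            failures.append(project)
    if failures:
        sys.exit(f"Failing downstream suites: {', '.join(failures)}")


if __name__ == "__main__":
    main()
```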

jsignell commented 2 years ago

It looks like @jrbourbeau started getting dask set up with texasbbq a few years back :) https://github.com/jrbourbeau/dask-integration-testing

dcherian commented 2 years ago

> Might even just be worthwhile to do runs of these tests every 24hrs or so.

A while ago I asked a bunch of projects downstream of xarray to run their test suites regularly against xarray HEAD. It has really helped catch issues before release.

Perhaps a bunch of downstream projects could do the same with dask HEAD. Here's the current xarray workflow: https://github.com/pydata/xarray/blob/main/.github/workflows/upstream-dev-ci.yaml It's really nice! It even opens an issue with a nice summary when tests fail.

jakirkham commented 2 years ago

This raises another good point. Maybe it is worth just adding some jobs to the dask-* projects to test the latest Dask + Distributed. IDK to what extent these exist now (so feel free to say where this is needed/done). We could then add a cron job as well (especially for some of the more stable dask-* projects) to run overnight with the latest changes. IDK if there is a GH Action to raise cron job failures in an issue, but that might be a good way to raise visibility about anything that breaks overnight.

jacobtomlinson commented 2 years ago

> IDK if there is a GH Action to raise cron job failures in an issue, but that might be a good way to raise visibility about anything that breaks overnight

Yeah, there are definitely ways to raise issues from GitHub Actions. I wonder where a good place to open the issue would be? For projects like dask-kubernetes it might be distributed, and for projects like dask-sql it might be dask?
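As one sketch of the mechanics (independent of any particular Action), a workflow step could open the issue itself via the GitHub REST API; the target owner/repo and token handling below are placeholders:

```python
# Minimal sketch: open an issue reporting a failed scheduled run via the
# GitHub REST API. OWNER/REPO are placeholders; the token would normally come
# from the workflow's secrets.
import os

import requests

OWNER = "dask"       # placeholder: wherever the failure should be reported
REPO = "community"   # placeholder
TOKEN = os.environ["GITHUB_TOKEN"]


def open_ci_failure_issue(summary: str) -> None:
    resp = requests.post(
        f"https://api.github.com/repos/{OWNER}/{REPO}/issues",
        headers={
            "Authorization": f"token {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        json={"title": "Nightly upstream CI failure", "body": summary},
        timeout=30,
    )
    resp.raise_for_status()
```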

jakirkham commented 2 years ago

Even if they are raised on the projects themselves, that could also be useful. Basically I'm just thinking about how we make CI failures more visible. Red Xs can easily be missed.

jsignell commented 2 years ago

We have already copied the xarray upstream infrastructure on dask/dask. There is an upstream action that runs every night and raises an issue with any failures. Here's the yaml for that: https://github.com/dask/dask/blob/main/.github/workflows/upstream.yml