martinfleis opened 1 year ago
This is something we've been talking about with scverse, especially since we are largely in the lineage of bioconductor and the R ecosystem.
An alternative I think we might be happy with is getting downstream packages to have a CI job that tests against pre-releases (e.g. `pip install --pre`). As long as the central packages are good about making pre-releases, actively maintained downstream packages should hit incompatibilities before a general release is made.
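As a minimal sketch of what such a CI step could do (the package names and the helper are illustrative, not an agreed convention), the job would upgrade the core stack to pre-releases before running the downstream suite:

```python
import subprocess
import sys

# Hypothetical list of upstream packages whose pre-releases we want to test against.
CORE_PACKAGES = ["numpy", "scipy", "pandas"]

def prerelease_install_command(packages):
    """Build a pip invocation that allows pre-release versions (--pre)."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", "--pre", *packages]

# In a downstream CI job you would then run something like:
#   subprocess.run(prerelease_install_command(CORE_PACKAGES), check=True)
#   subprocess.run([sys.executable, "-m", "pytest"], check=True)
```

In practice this would likely just be two shell steps in the workflow file; the point is only that `--pre` is the single switch that makes existing downstream CI pick up release candidates.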
As we're also creating a registry of downstream packages in our ecosystem, we would be able to inspect dependent packages' recent CI runs (assuming they are using a workflow we provide on GitHub Actions) to get some visibility into whether errors are being encountered due to our releases.
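To sketch what that inspection could look like (the workflow file name `ecosystem-tests.yml` is a made-up placeholder; the REST endpoint for listing workflow runs is real), one could poll each registered repo and compute a failure rate over recent completed runs:

```python
import json
from urllib.request import Request, urlopen

# Hypothetical name of the shared workflow the ecosystem would provide
# to every registered downstream repo.
WORKFLOW_FILE = "ecosystem-tests.yml"

def runs_url(owner, repo, per_page=20):
    """URL of the GitHub REST endpoint listing recent runs of the shared workflow."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{WORKFLOW_FILE}/runs?per_page={per_page}"
    )

def failure_rate(runs_payload):
    """Fraction of completed runs that failed, given a decoded /runs response."""
    completed = [r for r in runs_payload["workflow_runs"] if r["status"] == "completed"]
    if not completed:
        return 0.0
    failed = sum(1 for r in completed if r["conclusion"] == "failure")
    return failed / len(completed)

# Usage (needs network access and, for higher rate limits, an auth token):
#   payload = json.load(urlopen(Request(runs_url("geopandas", "geopandas"))))
#   print(failure_rate(payload))
```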
Astropy is trying to use https://github.com/astropy/astropy-integration-testing and we're about to try it for real in a few weeks. 🤞
Also see https://github.com/scientific-python/summit-2023/issues/3 for cross-project testing.
Thinking about this in more detail, I think it has three parts.
The first is running the test suite of an installed package via `pytest --pyargs package_name`. In some cases, packages (like geopandas, for example) use additional datasets for testing that are not shipped with the package. In the optimal situation, all necessary components are part of the package. However, that is not always possible, and we may want to figure out some pytest mark that can be used to filter out tests that are not expected to pass in this situation. That may take the form of some SPEC? Or become a part of a related one?

Asking downstream packages to test against nightly wheels or the main branch is also good to have, but it moves the responsibility to those downstream packages. Given that some larger packages may have more (or at least some) funding, it may make sense to give a helping hand and test against downstream packages from the upstream one. So IMHO these two approaches are complementary.
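One way such a mark could look (the name `needs_repo_data` is purely an illustration, not a proposed standard): downstream packages tag tests that rely on repository-only data, and the runner deselects them when testing an installed distribution.

```python
# conftest.py sketch for a downstream package.
# The mark name "needs_repo_data" is hypothetical.

def pytest_configure(config):
    # Register the marker so `pytest --strict-markers` does not complain.
    config.addinivalue_line(
        "markers",
        "needs_repo_data: test relies on data present only in the git repository",
    )

# A test that cannot pass from an installed wheel would then look like:
#
#   @pytest.mark.needs_repo_data
#   def test_reads_fixture_file():
#       ...
#
# and the reverse-dependency runner would deselect it with:
#   pytest -m "not needs_repo_data" --pyargs package_name
```

The `-m "not <mark>"` deselection is standard pytest behavior, so the only thing a SPEC would need to standardize is the mark's name and meaning.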
Got a solution for a conda-forge-based reverse dependency check:
```
mamba repoquery whoneeds -c conda-forge <package>
```
It returns all versions of downstream packages but that is easy to filter.
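Sketching that filtering step (assuming the plain-text output format where each line starts with the package name followed by version and build columns; the example lines are made up):

```python
def unique_downstream_names(repoquery_lines):
    """Collapse `mamba repoquery whoneeds` output lines to unique package names,
    preserving first-seen order."""
    names = []
    seen = set()
    for line in repoquery_lines:
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names

# Example with made-up output lines:
lines = [
    "geopandas 0.13.2 pyhd8ed1ab_0",
    "geopandas 0.14.0 pyhd8ed1ab_0",
    "momepy 0.6.0 pyhd8ed1ab_0",
]
print(unique_downstream_names(lines))  # ['geopandas', 'momepy']
```

Using `--json` output and proper parsing would be more robust, but the idea is the same: collapse the per-version rows into one entry per downstream package.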
This took the shape of https://github.com/scientific-python/reverse-dependency-testing.
I was recently chatting with some folks developing R packages and learned that when they want to make a release available on CRAN, the package is installed in the environment and the test suites of all downstream packages are run to ensure that the new release does not come with an unexpected breaking change. While I think this is a bit too much (and it does indeed cause some friction and negativity), I'd like to explore options for doing the same with Python.
I can imagine we have nightly runs that install a defined set of downstream packages (not one or two, but more like twenty or forty) and run their tests against the current main. While this is already technically feasible, I am not aware of anyone doing it comprehensively, and it currently faces the issue of inconsistency of build and CI systems.
We could clone each repo and build from source, but then we need to know how each of the downstream packages gets built and prepare for that, which is non-trivial for a lot of spatial stuff depending on GDAL and friends. However, I don't think we need to test against main; testing against the latest release should be enough. But that comes with another issue: if we install the packages from PyPI, the tests are sometimes not part of the distribution, and in other cases will not run because some data they use is available only in the repo itself.
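A nightly runner along these lines could be sketched as follows (the registry contents and the structure are illustrative; it assumes each package ships its tests in the distribution so `pytest --pyargs` can find them, which, as noted above, does not always hold):

```python
import subprocess
import sys

# Hypothetical registry of downstream packages to exercise nightly.
DOWNSTREAM = ["geopandas", "momepy", "libpysal"]

def pytest_command(package):
    """Command that runs the installed test suite of `package`."""
    return [sys.executable, "-m", "pytest", "--pyargs", package]

def run_downstream_suite(package):
    """Install the latest release of `package` and run its installed tests.

    Returns True when the suite passes.
    """
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--upgrade", package],
        check=True,
    )
    return subprocess.run(pytest_command(package)).returncode == 0

# A nightly job would loop over the registry and report the failures:
#   failures = [pkg for pkg in DOWNSTREAM if not run_downstream_suite(pkg)]
```

Running each suite in a fresh environment (or container) per package would avoid the downstream packages' dependencies conflicting with each other, but that is an orchestration detail on top of the same core loop.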
I don't have a clear idea of how to tackle this, but I'd love to spend some time thinking about tooling that may enable it. It has happened numerous times that we found a regression only because the CI of a downstream package failed. This way, we would catch it ourselves.