alteryx / woodwork

Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
https://woodwork.alteryx.com
BSD 3-Clause "New" or "Revised" License
145 stars 20 forks

CI tests need to be updated to run with latest versions specified in core requirements #1738

Closed: thehomebrewnerd closed this 6 months ago

thehomebrewnerd commented 1 year ago

With the current library installation flow, CI can end up testing against older library versions than the latest ones allowed by the core requirements. This happens when the dask or spark extras carry additional version restrictions that are not part of the core requirements.

Take this run of the latest dependency checker for example. This was triggered by a new version of pandas (2.0.3).

However, if you follow the installation process in the logs, you see that during the Install woodwork - requirements step, pandas 2.0.3 and numpy 1.25.1 both get installed. Further down, we run the Install Dask and Spark step. Because the Spark requirements have an upper-bound restriction on both pandas and numpy, these libraries are downgraded to 1.5.3 and 1.23.5, respectively, before the tests start.

The end result is that even though this PR suggests that all the tests pass with the latest version of pandas, we didn't actually run the CI with the new version that triggered the PR. This should be fixed so that our tests include a run with only the core requirements (no dask or spark extras), to make sure the latest versions allowed by the core requirements are actually tested.
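
As a rough illustration of how a CI job could detect the silent downgrade described above, here is a minimal sketch of a version guard. It is not part of woodwork or its workflows; the function name `find_version_mismatches` and its `lookup` parameter are hypothetical, and the expected versions would have to be supplied by whatever step resolves the latest releases. It relies only on `importlib.metadata` from the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def find_version_mismatches(expected, lookup=version):
    """Compare expected package versions against what is actually installed.

    `expected` maps package names to the version strings we intended to test.
    `lookup` returns the installed version for a name (hypothetical hook so the
    check can be exercised without touching the real environment).
    Returns {name: (expected_version, installed_version_or_None)} for every
    package whose installed version differs from the expected one.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            # Package is not installed at all.
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Example use in a CI step, run right before the test suite starts
# (versions taken from the run discussed in this issue):
#
#     problems = find_version_mismatches({"pandas": "2.0.3", "numpy": "1.25.1"})
#     if problems:
#         raise SystemExit(f"Installed versions differ from expected: {problems}")
```

Run after all install steps, a check like this would have failed the job discussed here, since pandas and numpy had been downgraded by the time the tests started. It complements, rather than replaces, a separate run that installs only the core requirements.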