Open ayushdg opened 2 weeks ago
cc: @rjzamora If you could take a look
Is the goal here to raise an error in any case that the user does not explicitly opt out of query-planning? This means all curator users will see this error unless they explicitly set ther config to false ahead of time.
The intended goal is to raise this error whenever Query-Planning
is enabled and dask is going down that route. In newer versions of dask, I've noticed that this value is being set to None
on importing dask.dataframe, but still going down the query planning route.
I now realize that it could be None
in the case where dask.dataframe
is not imported at all. I'll think about this a bit more, but curious if you see better ways of checking whether query planning
is truly enabled.
I now realize that it could be None in the case where dask.dataframe is not imported at all. I'll think about this a bit more, but curious if you see better ways of checking whether query planning is truly enabled.
Right - I believe you have stumbled upon the primary reason we rapids-24.04 was pinned to such an "old" version of dask: There is currently no reliable way (that I know of) to check if the user has already imported dask.dataframe
before dask_cudf
is imported :/
That said, it may be possible for us to add a canonical "switch" in rapids-dask-dependency
.
Description
Importing newer versions of dask (tested with 2024.5) set
dataframe.query-planning
to None instead ofTrue
orFalse
, but theNone
case defaults to using query planning. This PR extends the check toNone
to raise relevant errors w/ newer versions of dask.Usage
N/A
Checklist