Closed rjzamora closed 5 months ago
@rjzamora adding the following env variable to both ./ci/test_python.sh
and ci/test_wheel.sh
should resolve the tests for now:
ci/test_python.sh
DASK_DATAFRAME__QUERY_PLANNING=False pytest \
2. `ci/test_wheel.sh`
```bash
DASK_DATAFRAME__QUERY_PLANNING=False python -m pytest -n 8 ./python/cuxfilter/tests
Thanks @AjayThorve !
Just to clarify: My primary goal here is to find/fix the problems with DASK_DATAFRAME__QUERY_PLANNING=True
. Therefore, It's fine to switch of query-planning to unblock CI, but in this particular PR, I'd like to make sure query-planning is enabled.
Thanks @AjayThorve !
Just to clarify: My primary goal here is to find/fix the problems with
DASK_DATAFRAME__QUERY_PLANNING=True
. Therefore, It's fine to switch of query-planning to unblock CI, but in this particular PR, I'd like to make sure query-planning is enabled.
okay, that makes sense! Thanks for the PR!
I am only seeing a failure in one test now (cuxfilter/tests/charts/datashader/test_graph_assets.py::TestGraphAssets::test_calc_connected_edges
).
As far as I can tell, we are comparing the following output DataFrame:
x y color
0 1.0 1.0 11.0
1 1.0 1.0 11.0
2 NaN NaN NaN
3 1.0 1.0 10.0
4 0.0 0.0 10.0
5 NaN NaN NaN
...to an "expected" DataFrame:
x y color
0 1.0 1.0 10.0
1 0.0 0.0 10.0
2 NaN NaN NaN
3 1.0 1.0 11.0
4 1.0 1.0 11.0
5 NaN NaN NaN
Before I dig in too far here, is it possible that both results are correct? That is, does the order of the rows matter here, or just the relationship between "x"
, "y"
and "color"
? The new dask_cudf
API now uses more upstream algorithms than the legacy API did. So, the ordering of some results may be different.
@rjzamora should I go ahead and merge this? Lgtm.
should I go ahead and merge this? Lgtm.
Seems fine to me as long as you think the test_graph_assets.py
change makes sense :)
/merge
This PR is intended to resolve test failures related to the recent dask-expr migration in
dask.dataframe
. I noticed that cuxfilter was importing fromdask_cudf.core
(no longer allowed when query-planning is enabled). There may be other issues as well.