-
https://github.com/dask/dask/pull/6066 recently added a `DataFrame.shuffle`
method for partitioning a dataframe by one or more columns based on the hash of
the column's values. All rows with an equa…
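
As a rough illustration of the hash-based partitioning described in this snippet, here is a minimal pure-Python sketch. It is not dask's implementation and does not call `DataFrame.shuffle`. The helper `hash_partition`, its parameters, and the sample rows are all hypothetical, and `zlib.crc32` is used only to get a hash that is stable across processes.

```python
from collections import defaultdict
from zlib import crc32


def hash_partition(rows, on, npartitions):
    """Group rows (dicts) into buckets by hashing the values of the `on` columns.

    Hypothetical helper for illustration only; not part of dask.
    """
    partitions = defaultdict(list)
    for row in rows:
        # Build a stable byte key from the selected column values.
        key = repr(tuple(row[col] for col in on)).encode()
        # Rows with identical key values always map to the same bucket.
        partitions[crc32(key) % npartitions].append(row)
    return dict(partitions)


# Made-up sample data.
rows = [
    {"user": "a", "value": 1},
    {"user": "b", "value": 2},
    {"user": "a", "value": 3},
    {"user": "c", "value": 4},
]
parts = hash_partition(rows, on=["user"], npartitions=2)
```

Because the bucket depends only on the key columns, every row for a given `user` ends up in one bucket, though a bucket may hold several distinct keys.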
-
**Describe the issue**:
`KFold.split` doesn't support dask dataframes. With the recent integrations of dask in, e.g., xgboost and optuna, it would be very useful if it did. The error message acknowle…
-
Next Demo Day: October 3rd
---
See what the Dask community has been up to, or share some Dask work of your own. Demos are short and informal (~5-10 minutes). Have something you'd like to share? Le…
-
**Feature request**
Implement `Catalog.to_dask_dataframe()` which would return `self._ddf.copy()`.
**Before submitting**
Please check the following:
- [x] I have described the purpose of the…
-
**Describe the bug**
When doing lazy validation on dask dataframes, currently only the first partition to raise a `SchemaErrors` exception is reported.
- [X] I have checked that this issue has n…
-
When #4300 merges, it will enable using dask_cudf DataFrames to train and predict with the single-GPU versions of the models. This should warn, but only when directly using dask_cudf DataFrames, not wh…
-
(Edited by @m-albert)
In the presence of `pyarrow`, dask by default assumes dataframe columns of `object` dtype to be pyarrow strings (see https://github.com/dask/dask/issues/10139#issuecomment-1655928619).
…
-
Dask Arrays mostly support [NEP-18](https://numpy.org/neps/nep-0018-array-function-protocol.html) dispatching via `__array_function__`, meaning you can call NumPy functions like `np.where` on dask Arr…
-
Right now the Shapes model is an `AnnData` object. We are just saving the coordinates in `adata.obsm["spatial"]` and the metadata in `uns`. I feel this is overkill and we don't really have plans to extend this…
-
**Describe the bug**
A warning about the assumption that indexes are aligned is triggered every time `zonal_stats` is called with a dask array (returning a dask dataframe).
This is triggered by (I believe) this…