-
**Is your feature request related to a problem? Please describe.**
I wish I could use cuML's `train_test_split` on a `dask_cudf.DataFrame`.
**Describe the solution you'd like**
```
from cuml.dask.pr…
```
-
The following issue describes outstanding TODO items related to the `Canvas.trimesh()` feature (from #525):
## API
- [x] Write utility function (in _ds.utils_) to help generate the data structure …
-
I am looking to either iterate over groups to get names of each group or to have an accessor to the group name in a groupby.apply().
First goal would be an answer like this: [python - Using groupby…
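For reference, here is a minimal sketch of the two behaviors described above, as plain pandas exposes them. The data and column names (`key`, `value`) are hypothetical, and the sketch makes no claim about whether the library this request targets supports the same pattern.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})

# 1) Iterating over a groupby yields (name, group) pairs.
group_names = [name for name, _ in df.groupby("key")]

# 2) Inside apply, pandas sets each group's .name to the group key.
names_from_apply = df.groupby("key")["value"].apply(lambda s: s.name)

print(group_names)
print(list(names_from_apply))
```

In pandas, both approaches recover the same keys; the second is the one that matters when per-group logic needs to know which group it is processing.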
-
This is similar to https://github.com/dask/dask/issues/9879, but smaller in scope.
## Motivation
We've seen several cases where using `pyarrow` strings for text data has significant memory usa…
-
In pandas 2.0, whenever an empty DataFrame or Series is created, its index is a `RangeIndex` with dtype `int64`. See https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#empty-dataframe…
-
I am using `read_csv()` to read a long list of CSV files and return two DataFrames.
I have managed to speed up this step by using Dask. Unfortunately, I have not been able to return multiple variab…
-
I used Dask (and xarray) to combine a set of h5py files into a dataframe.
This worked great until I updated dask from 2.28 to 2021.07.1.
If I run the same script now, I always run out of memory, …
-
Xarray already integrates with Dask (see: https://docs.xarray.dev/en/stable/user-guide/dask.html).
We could perhaps use Dask dataframes (instead of pandas), e.g., in RegressionModels, to speed up computati…
-
While creating a synthetic dataset with [cuML Dask make_classification](https://docs.rapids.ai/api/cuml/stable/api.html#cuml.dask.datasets.classification.make_classification) to create into a cuGraph …
-
Similar to #1498. I think that, as the queries are currently written, it isn't a fair comparison between DataFrame APIs.
For SQL it is fair, as the TPC-H benchmark states that all engines should parse th…