-
The new `ColumnTransformer` (not merged yet, https://github.com/scikit-learn/scikit-learn/pull/9012) would also be useful in a `dask_ml` context when using estimators that can handle dask objects, and i…
-
Hello,
I have started using dask-sql, but I can't get even a simple query to work; the only thing that runs is a full selection with `select * from df;`. Besides this query I can't do anything else; in every query I get the sa…
-
When running a number of SQL scripts in a session, it's sometimes not clear whether all scripts "clean up" after themselves (dropping temp tables, unpersisting tables, de-registering UDFs, etc.).
It …
-
In dask dataframes, the partitions should be ordered so that the hipscat index increases between them. In some methods such as polygon and cone filtering, the partitions are ordered by the partition_i…
-
From @rabernat on [Twitter](https://twitter.com/rabernat/status/1330707155742322689):
> "Xarray has some secret private classes for lazily indexing / wrapping arrays that are so useful I think they…
-
After spending some time working out why my groupby operation was not working, I came across https://examples.dask.org/dataframes/02-groupby.html#Many-groups. If it wasn't for the great docs around das…
-
```
from dask_sql import Context
import pandas as pd
import dask.dataframe as dd
c = Context()
pd.DataFrame({'id': [0, 1, 2]}).to_parquet('/data/test/part.0.parquet')
# this works
c.sql("…
```
-
The latest release `2024.3.0` enabled query planning for `DataFrame`s by default. This issue can be used to report feedback and ask related questions.
If you encountered a bug or unexpected behavio…
-
See the discussion at https://github.com/coiled/coiled-examples/pull/1#discussion_r461243952. Ideally we'd be able to pass dask arrays / dataframes to `HyperbandSearchCV.score`. We'd like for `Hyperba…
-
Dask dataframes are missing the "reindex" function. Would be great to support it, as it's a useful primitive for time series analysis. I think limiting the support to just sorted Int64Index or DateT…
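
For reference, a minimal sketch of the behaviour being requested, shown with plain pandas rather than dask (the sample dates and values are made up for illustration). It assumes the sorted `DatetimeIndex` case mentioned above: `reindex` conforms a series to a new index, leaving `NaN` where no row existed, and `method="ffill"` carries the last observation forward, which pandas only allows on a monotonic index.

```python
import pandas as pd

# Hypothetical sample data: a sorted DatetimeIndex with one missing day (Jan 3).
index = pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-04"])
series = pd.Series([1.0, 2.0, 4.0], index=index)

# Target index: every calendar day in the covered range.
full_index = pd.date_range("2021-01-01", "2021-01-04", freq="D")

# Plain reindex: labels absent from the original index get NaN.
with_gaps = series.reindex(full_index)

# Forward-fill reindex: each new label takes the last earlier observation.
# pandas requires the original index to be monotonic for this to work.
filled = series.reindex(full_index, method="ffill")

print(with_gaps)
print(filled)
```

A dask version would presumably need to know which partition each new label falls into, which is where a sorted index with known divisions would help; that part is not sketched here.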