-
We've been working on adding a query optimization layer to Dask DataFrames for a while now. The project lives at https://github.com/dask-contrib/dask-expr
The status quo can be summarised as follo…
phofl updated
8 months ago
-
Hi folks, I'm Devin from the Modin group.
Between our groups, I believe there's a high potential for collaboration and improving the data science experience for Dask and Modin users. Due to the com…
-
When I concat DateTimeIndexed dataframes that have overlapping partitions, dask generates a large number of repartition-merge and repartition-split tasks. With 400 dataframes I have seen it g…
-
In PR (https://github.com/dask/dask/pull/4708), we chose to keep using `__array_wrap__` instead of `__array_function__` for dataframes. This is a reasonable choice as currently dataframe libraries d…
-
Hi developer.
Thanks a lot for the useful tools, and sorry to keep bothering you.
I need a 3D visualization of my Xenium dataset. Now I can correctly load my Xenium data and check the transcripts point…
-
I installed the basic dask version using "pip install dask". When running, I receive a FutureWarning:
> Dask dataframe query planning is disabled because dask-expr is not installed. You can install …
-
I've just opened this issue in the dask repo, but maybe here is better...
I'm using dask to implement a data pipeline with dask dataframes and dask-ml in a YARN cluster.
When I build an XGBo…
-
What would it look like to parallelize/distribute GeoPandas with Dask?
Dask has created parallel variants of NumPy arrays and Pandas DataFrames. I think it may be sensible to do the same with Geo…
-
**Exception**
```
ValueError: The columns in the computed data do not match the columns in the provided metadata
Order of columns does not match
```
**Repro code**
```
from dask.dataframe import …
-
We recently added a `dataframe.dtype_backend` config option for specifying whether classic `numpy`-backed dtypes (e.g. `int64`, `float64`, etc.) or `pyarrow`-backed dtypes (e.g. `int64[pyarrow]`, `flo…