-
Currently, Pandas serializes views of ArrowStringArrays by serializing the whole array rather than only the viewed subset. Here is an example:
```python
In [1]: import pandas as pd
In [2]: s = pd.Series([c …
-
**Note**: This issue was originally created as [ARROW-376](https://issues.apache.org/jira/browse/ARROW-376). Please see the [migration documentation](https://gist.github.com/toddfarmer/12aa88361532d2…
-
In the course of [setting up continuous integration](https://github.com/catalyst-cooperative/rmi-ferc1-eia/issues/151) in the `rmi-ferc1-eia` repository, we discovered that the current plant part list…
-
Today, we support the `applymap` interface on Series but not DataFrames. Pandas supports `applymap` on DataFrames but not Series. In pandas, the interface applies a scalar function/UDF to eve…
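
For reference, a minimal sketch of the pandas behavior described above. The helper name `elementwise` and the sample data are hypothetical, chosen only for illustration. Recent pandas releases (2.1 and later) expose the same operation as `DataFrame.map` and deprecate `applymap`, so the sketch falls back to `applymap` only when `map` is not available.

```python
import pandas as pd


def elementwise(df, func):
    """Apply a scalar function to every element of a pandas DataFrame.

    Hypothetical helper for illustration only.
    """
    # pandas >= 2.1 names this DataFrame.map; older releases only have applymap.
    if hasattr(pd.DataFrame, "map"):
        return df.map(func)
    return df.applymap(func)


def scale(x):
    # A scalar UDF: it receives one element at a time.
    return x * 2 + 1


df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
result = elementwise(df, scale)

# pandas Series has no applymap; Series.map is the elementwise counterpart.
series_result = pd.Series([1, 2, 3]).map(scale)

print(result)
print(series_result)
```

The result keeps the shape, index, and column labels of the input; only the values change.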
-
I would like to join a large cuDF DataFrame with a small series/list/sequence as a SQL-style full join, or, even better, with the small series/list broadcast for the full join as in Spark SQL, while th…
-
I'm using Dask + Datashader over here: https://github.com/mrocklin/dask-tutorial/blob/main/2-dataframes-at-scale.ipynb
I'm finding that I'm spending around 20s serializing things; this is mostly in…
-
Repro:
```
import pandas as pd
from dask_sql import Context
c = Context()
df = pd.DataFrame({"id": [0, 1, 1, 2], "val": [1, 1, 2, 1]})
c.create_table("df", df)
c.sql("""
SELECT
val,
…
-
Here are some propositions as discussed in https://github.com/pangeo-data/foss4g-2022/pull/45.
Please indicate whether it's OK for you (especially @tinaok):
- [x] Add a little part on Dask Clust…
-
**Describe the bug**
An AttributeError occurs when I use groupby...apply on a dask dataframe.
> AttributeError: 'SeriesGroupBy' object has no attribute 'apply'
**Steps/Code to reproduce bug**
`from…