-
**What happened**: When loading a Parquet file, I specified a column twice in the "columns=" argument and the column was loaded twice, that is, there were two columns in the resulting DataFrame with t…
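A minimal sketch of a caller-side workaround, assuming the duplicate comes from the list passed to `columns=`: drop repeated names, keeping first-seen order, before handing the list to the reader. The helper name `unique_columns` is hypothetical and not part of any library.

```python
def unique_columns(columns):
    """Return column names with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order, so fromkeys() dedupes without reordering
    return list(dict.fromkeys(columns))


# Hypothetical request that names column "a" twice
requested = ["a", "b", "a"]
print(unique_columns(requested))
```

The deduplicated list can then be passed as the `columns=` argument in place of the original one.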
-
**Is your feature request related to a problem? Please describe.**
I'd like to calculate the median and/or quantiles of a column after grouping a dask-cudf DataFrame.
**EDIT** 5/10/2024: median is …
rnyak updated
6 months ago
-
### Is there already an existing issue for this?
- [X] I have searched the existing issues and there is none for my device
### Product name
Tuya Smart ZigBee Water Timer Sprinkler
### Manufacturer…
-
We currently only support the deprecated query syntax for deletion scopes. It would be more intuitive to specify the deletion scope using the predicate syntax.
Old syntax
```
update_dataset_from_…
-
Dask can perform groupby operations relatively well for columns with low cardinality, but performance seems to degrade significantly for columns with more distinct values.
Let's look at the h2o gro…
-
I would very much like to use this to speed up LSST stack imports... (And potentially `butler` CLI commands?) I'm running into an issue running the code after copying.
```
Traceback (most recent c…
-
```
ddf = dd.from_dict(
{"A": range(8), "B": [1, 1, 2, 2, 3, 3, 4, 4]},
npartitions=4,
)
ddf.to_parquet(tmp_path, engine=engine)
with pytest.raises(ValueError, …
phofl updated
10 months ago
-
I created a ddf and then tried to run `ddfmanager.sql("select count(*), column from ddf group by column")`.
It fails with the following error message:
Caused by: org.apache.flink.api.common.typeutils.CompositeType$…
-
This is the given code:
life_exp_dataset = pd.read_csv(
"https://raw.githubusercontent.com/open-numbers/ddf--gapminder--systema_globalis/master/ddf--datapoints--life_expectancy_years--by--geo--…
-
**Feature request**
Implement `Catalog.to_dask_dataframe()` which would return `self._ddf.copy()`.
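A rough sketch of what the requested method might look like. Only the `_ddf` attribute and the `self._ddf.copy()` call come from the request; the constructor and the rest of the class are assumptions made for illustration, and a plain `dict` stands in for a real Dask DataFrame so the snippet runs without Dask installed.

```python
class Catalog:
    """Hypothetical stand-in for the catalog class described in the request."""

    def __init__(self, ddf):
        # Assumed: the catalog keeps its underlying frame in `_ddf`
        self._ddf = ddf

    def to_dask_dataframe(self):
        # Return a copy so callers do not get a reference to the internal object
        return self._ddf.copy()


# A dict is used here only because it also provides .copy()
catalog = Catalog({"ra": [1.0, 2.0], "dec": [3.0, 4.0]})
frame = catalog.to_dask_dataframe()
print(frame is catalog._ddf)
```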
**Before submitting**
Please check the following:
- [x] I have described the purpose of the…