-
Dask does two things differently when computing `std` and `mean` for timedelta columns.
For a DataFrame, timedelta columns are dropped; for a Series, a different dtype is returned (float64 or numpy.…
-
Problem:
see https://github.com/open-numbers/ddf--gapminder--co2_emission/issues/1
semio updated
7 years ago
-
[read_dataset_as_ddf](https://github.com/JDASoftwareGroup/kartothek/blob/master/kartothek/io/dask/dataframe.py) has a confusing signature. It is not clear which arguments are required and which are not.…
DD5HT updated
5 years ago
-
The first example people see when reading about dask-geopandas is actually an [anti-pattern](https://docs.dask.org/en/latest/best-practices.html#load-data-with-dask) that Dask encourages us not to use.…
-
The new Dask multi-GPU logistic regression should convert dtypes if needed rather than fail when the input data structures do not adhere to the dtype expectations of the C++ implementation.
The exact…
-
![ddm](https://user-images.githubusercontent.com/579018/206372141-0684cdbd-710e-4ef3-a9b8-28453748e803.gif)
![ddf](https://user-images.githubusercontent.com/579018/206372142-377942b7-fff5-4929-88c0-5…
-
The DDF team has identified some generic checking requirements which might be best implemented through the checking of USDM JSON data against a defined USDM schema (e.g., JSON-Schema). These checks i…
-
**Describe the issue**:
In JSON format, a key with no value associated with it can be represented as `"key": null`, or by removing the entire key. Both forms seem to be commonly used.
When r…
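
To make the distinction between the two forms concrete, here is a minimal sketch using only Python's standard `json` module. The field names `id` and `key` are placeholders chosen for illustration, and the snippet does not reproduce the reader from the issue; it only shows that a lookup with `dict.get()` treats both forms alike, while a membership test does not.

```python
import json

# Two common encodings of "no value": an explicit null, or the key left out.
# The field names here are illustrative placeholders.
with_null = json.loads('{"id": 1, "key": null}')
without_key = json.loads('{"id": 1}')

# dict.get() cannot tell the two forms apart: both yield None.
same_via_get = with_null.get("key") is None and without_key.get("key") is None

# A membership test does distinguish them.
null_has_key = "key" in with_null
missing_has_key = "key" in without_key

print(same_via_get, null_has_key, missing_has_key)
```

Code that needs to preserve the difference has to check for key presence explicitly rather than rely on the looked-up value.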
-
Possibly not an issue and I'm being stupid...but I don't see a way to have some attribute of svychisq (e.g. p.value) returned for each group_by selection (presuming the group_by generates a series of r…
-
I'm trying to process parquet files stored in AWS S3.
The files are read simply with
```python
with Client(n_workers=6) as client:
    df = dd.read_parquet('s3://lightnings_*.gzip.parquet')
…