-
We've run into some performance issues when running `ddf_utils.package.create_datapackage()`. We have some files with hundreds of thousands of entities and running this function takes a very long time…
-
| Field | Value |
| --- | --- |
| Bugzilla Link | [567352](https://bugs.eclipse.org/bugs/show_bug.cgi?id=567352) |
| Status | UNCONFIRMED |
| Importance | P3 normal |
| Reported | Sep 25, 2020 08:11 EDT |
| Modifi…
-
https://github.com/discoproject/disco/wiki/DDFS-Evolution
This page was last edited on June 14, 2012.
Is this information out of date, or is some of it still applicable to current-day DDFS?
…
ghost updated
7 years ago
-
```
ddf = dd.from_dict(
    {"A": range(8), "B": [1, 1, 2, 2, 3, 3, 4, 4]},
    npartitions=4,
)
ddf.to_parquet(tmp_path, engine=engine)
with pytest.raises(ValueError, …
```
phofl updated
8 months ago
-
Hi @trxcllnt & team!
Curious if there are any pointers on this. Ultimately, we're trying to set up a basic ~LRU over Python-managed dask-cudf objects (raw ddf or published) and have node do reads+writes with …
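
For readers unfamiliar with the idea, a basic LRU over named Python objects can be sketched with the standard library alone. This is only an illustration under stated assumptions: `LRUStore`, its methods, and the string placeholders are hypothetical names, not part of dask-cudf or any library mentioned in the issue, and nothing here handles GPU memory or published datasets.

```python
from collections import OrderedDict


class LRUStore:
    """Hypothetical least-recently-used store for named Python objects."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the stored object and mark it as most recently used."""
        value = self._items[key]  # raises KeyError if the key is absent
        self._items.move_to_end(key)
        return value

    def put(self, key, value):
        """Store an object and return the (key, value) pairs that were evicted."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        evicted = []
        while len(self._items) > self.capacity:
            # popitem(last=False) removes the least recently used entry
            evicted.append(self._items.popitem(last=False))
        return evicted

    def keys(self):
        """Return keys ordered from least to most recently used."""
        return list(self._items)


# Placeholder strings stand in for real dataframe objects.
store = LRUStore(capacity=2)
store.put("a", "frame-a")
store.put("b", "frame-b")
store.get("a")                       # "a" becomes the most recently used
evicted = store.put("c", "frame-c")  # capacity exceeded, so "b" is evicted
```

The list returned by `put` is where any cleanup of an evicted object would be hooked in; what that cleanup looks like for dask-cudf objects is outside what this sketch assumes.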
-
Hello everyone,
We are facing a problem when calling `dd.get_dummies` (or `DummyEncoder`) when using `Categorizer` to infer the categories.
The problem seems to arise when two columns have the same cat…
-
I'm trying to process parquet files stored in AWS S3.
The files are read simply with
```python
import dask.dataframe as dd
from dask.distributed import Client

with Client(n_workers=6) as client:
    df = dd.read_parquet('s3://lightnings_*.gzip.parquet')
```
…
-
We currently only support the deprecated query syntax for deletion scopes. It would be more intuitive to specify the deletion scope using the predicate syntax.
Old syntax
```
update_dataset_from_…
```
-
I'm trying to do some 3GC processing for a MeerKAT field and am coming across an issue when running DDF.py, which appears to be a problem with the output generated from MakeModel.py.
The command be…
-
As a spatial join workaround, I wanted to use `map_partitions` to run the spatial join (`geopandas.tools.sjoin`) on each partition and aggregate later down the line. Each partition produces an invalid…