-
After spending some time working out why my groupby operation was not working, I came across https://examples.dask.org/dataframes/02-groupby.html#Many-groups. If it weren't for the great docs around das…
-
When dataframes are shuffled, dask builds a hash of the index for each partition and buckets the hashes modulo n_partitions. cuDF has an optimized hash partitioning scheme:
https://github.com/rapi…
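The bucketing step above can be sketched in plain Python. This is a simplified illustration of hash partitioning, not dask's or cuDF's actual implementation (dask hashes index values with pandas utilities rather than the builtin `hash`):

```python
# Simplified sketch of hash-based shuffling: each index value is hashed
# and assigned to a bucket modulo the target partition count, so equal
# keys always land in the same output partition.
from collections import defaultdict

def hash_partition(index_values, n_partitions):
    """Bucket values by hash(value) % n_partitions."""
    buckets = defaultdict(list)
    for value in index_values:
        buckets[hash(value) % n_partitions].append(value)
    return dict(buckets)

buckets = hash_partition(range(10), n_partitions=3)
# Every value lands in exactly one of the 3 buckets.
assert sum(len(v) for v in buckets.values()) == 10
```

Because the bucket assignment depends only on the key's hash, two dataframes shuffled with the same scheme become co-partitioned, which is what makes a subsequent merge cheap.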
-
The `dask.datasets` module includes functions like `dask.datasets.timeseries` or `dask.datasets.make_people` for generating Dask dataframes or Dask bags, respectively, from random data.
It would be useful to hav…
-
## `r5.xlarge`: Running out of disk space despite having a 50 GB EBS volume & 36 GB RAM with `cnt = cnt.compute(num_workers=10)`
- the two dataframes being joined together are from a 20 GB & 10 GB avr…
-
*edit by TomAugspurger*
Currently partitions within a dask DataFrame do not know their own length. Anything using the length of the DataFrame or the partitions will need to compute it at runtime. …
-
For many use cases (like Xenium) points can be handled completely in memory without issue. Given that, and all the reasons the first "best practice" in the dask dataframes documentation is ["use panda…
-
The latest release `2024.3.0` enabled query planning for `DataFrame`s by default. This issue can be used to report feedback and ask related questions.
If you encountered a bug or unexpected behavio…
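For anyone who needs to compare against the legacy implementation, query planning can be disabled through dask's config (this uses the `dataframe.query-planning` key documented for releases around 2024.3.0; it must be set before `dask.dataframe` is first imported in the process):

```python
import dask

# Must run before the first `import dask.dataframe`; afterwards
# `import dask.dataframe as dd` uses the legacy (non-expr) implementation.
dask.config.set({"dataframe.query-planning": False})
print(dask.config.get("dataframe.query-planning"))  # False
```

The same toggle is available via the `DASK_DATAFRAME__QUERY_PLANNING=False` environment variable, which is handy when you cannot control import order.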
-
There are a number of optimised libraries for many packages, with optimisation at different levels...
## Intel Optimisations
* [Intel Extensions for Scikit-learn](https://intel.github.io/scikit-l…
-
```python
import dask
import dask.dataframe as dd
import pandas as pd  # needed for pd.DataFrame below

def process_df(df):
    return df

def make_df():
    return pd.DataFrame([[1, 3], [2, 3], [3, 4]], columns=['A', 'B'])

a = dd.from_delay…
```
-
In SQL, it's common to work with large data and aggregate or filter it down to few enough rows that it could be merged into a single partition in memory.
Today you can achieve this with something lik…