Closed jangorecki closed 3 years ago
This can unfortunately cause other issues, for example https://github.com/rapidsai/cudf/issues/3363#issuecomment-562159646
As pointed out by @datametrician, we should also be able to store data outside GPU memory (off-vmemory) with dask-cudf, so it makes sense to use dask-cudf even for a single GPU.
waiting for https://github.com/rapidsai/cudf/issues/2288
Dask does not seem to be mandatory for spilling to main memory. Because the dask-cudf setup is poorly documented, this part will be solved separately: https://github.com/h2oai/db-benchmark/issues/129
Without Dask, you still only use 1 GPU instead of both of them.
@datametrician yes, I am aware of it, so the plan is to move to dask-cudf, so this issue stays open.
Using dask-cudf will additionally allow us to attempt the 1e9 data size by using the spill-to-disk feature, as explained in https://github.com/rapidsai/cudf/issues/3740#issuecomment-573091892
https://github.com/rapidsai/cudf/issues/2288 has finally been resolved, and it looks like we can proceed with using dask_cudf to utilize both GPUs.
cudf uses only a single GPU, so it would be useful to employ dask-cudf rather than just cudf. https://blog.dask.org/2019/01/29/cudf-joins
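
For reference, a minimal sketch of what moving from cudf to dask-cudf on two GPUs might look like. It is not the benchmark's actual setup: the helper names `cluster_options` and `groupby_sum`, the GPU ids, and the `10GB` device memory limit are all assumptions for illustration. It relies on `dask_cuda.LocalCUDACluster` starting one worker per visible GPU and on `device_memory_limit` as the threshold above which a worker spills device memory to host.

```python
def cluster_options(gpu_ids, device_memory_limit="10GB"):
    """Build keyword arguments for dask_cuda.LocalCUDACluster.

    gpu_ids and the default limit are illustrative assumptions,
    not values taken from the benchmark.
    """
    return {
        # One Dask worker is started per GPU listed here.
        "CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids),
        # Per-worker threshold above which device memory spills to host.
        "device_memory_limit": device_memory_limit,
    }


def groupby_sum(csv_path, by, value, gpu_ids=(0, 1)):
    """Sum `value` grouped by `by`, with the work spread across the given GPUs."""
    # GPU-only imports stay local so cluster_options() works without RAPIDS installed.
    import dask_cudf
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    with LocalCUDACluster(**cluster_options(gpu_ids)) as cluster, Client(cluster):
        # Partitions of the CSV are distributed over the GPU workers.
        ddf = dask_cudf.read_csv(csv_path)
        return ddf.groupby(by)[value].sum().compute()
```

A call such as `groupby_sum("data.csv", "id1", "v1")` (hypothetical file and column names) would then run the aggregation on both GPUs; passing `gpu_ids=(0,)` keeps the same code path on a single GPU while still allowing spilling.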