Open VibhuJawa opened 2 years ago
Tried to address the `dask-sql` bit with PR https://github.com/dask-contrib/dask-sql/pull/330.
This issue has been labeled `inactive-30d` due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled `inactive-90d` if there is no activity in the next 60 days.
This issue has been labeled `inactive-90d` due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
Describe the bug
With PR we enabled training single-GPU cuML models using Dask DataFrames and Series, but we use `compute` there, which brings the data to the client. This causes the following issues:
- An `rmm-pool` on the workers leaves too little free memory on them, causing OOM.
Steps/Code to reproduce bug
Example with `dask-cudf`:
Trace:
Example with `dask-sql`:
Expected Behaviour:
I expect this to succeed, just as it would if we did this with `cuDF` DataFrames.
CC: @dantegd, @ChrisJar
Expected Solution
Unsure where we should push a fix for this.
For the `dask-sql` case it might be better to fix it in `dask-sql` and train there via a `map_partitions` call directly, and just error/warn for standalone `dask-cuDF`.
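For discussion, here is a minimal sketch of what the error/warn guard might look like. It is not cuML's implementation: `fit_with_guard`, `is_dask_collection_like`, and the `on_dask` parameter are hypothetical names made up for this comment. The only real detail it relies on is that Dask collections expose a `__dask_graph__` method, which is used here as a duck-typed check so the sketch runs without Dask installed.

```python
import warnings


def is_dask_collection_like(obj):
    """Duck-typed check: Dask collections expose a __dask_graph__ method."""
    return callable(getattr(obj, "__dask_graph__", None))


def fit_with_guard(model, X, y, on_dask="warn"):
    """Hypothetical wrapper: warn or raise before a Dask input is computed.

    on_dask is either "warn" or "error" (an assumption of this sketch).
    """
    if is_dask_collection_like(X) or is_dask_collection_like(y):
        msg = (
            "Fitting a single-GPU model on a Dask collection calls compute() "
            "and gathers all partitions on the client."
        )
        if on_dask == "error":
            raise TypeError(msg)
        warnings.warn(msg, UserWarning, stacklevel=2)
        # Materialize only after the caller has been told what will happen.
        X = X.compute() if is_dask_collection_like(X) else X
        y = y.compute() if is_dask_collection_like(y) else y
    return model.fit(X, y)
```

With `on_dask="error"` the caller is forced to materialize the data explicitly, or to go through a `map_partitions`-based path such as the one proposed for `dask-sql`. With `on_dask="warn"` the current behaviour is kept but made visible.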