Open christhorn2 opened 3 months ago
Describe the issue:
The API documentation for dask-ml's train_test_split states that blockwise=False is supported for arrays: "For Dask Arrays, set blockwise=False to shuffle data between blocks as well." https://ml.dask.org/modules/generated/dask_ml.model_selection.train_test_split.html#dask_ml.model_selection.train_test_split
I think this is also the intention of the code, which delegates the job to ShuffleSplit: https://github.com/dask/dask-ml/blob/567cfd7837c7616fd352e0efbcfcee42f351199c/dask_ml/model_selection/_split.py#L490
However, ShuffleSplit does not support blockwise=False:
https://github.com/dask/dask-ml/blob/567cfd7837c7616fd352e0efbcfcee42f351199c/dask_ml/model_selection/_split.py#L194
Minimal Complete Verifiable Example:
    from dask_ml.model_selection import train_test_split
    import dask.array as da

    x = da.arange(8, chunks=4)
    train_test_split(x, blockwise=False)
    ....
    NotImplementedError: ShuffleSplit with blockwise=False has not been implemented yet.
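Until this is implemented, one possible workaround might be to shuffle across blocks by hand. The sketch below is only an illustration, not dask-ml's API: `global_shuffle_split` is a hypothetical helper name, and it assumes the array has known chunk sizes and that an in-memory index of length n is affordable. It draws a NumPy permutation of all row indices and uses Dask's integer-array indexing to pull rows from any block.

```python
import numpy as np
import dask.array as da


def global_shuffle_split(x, test_size=0.25, seed=0):
    """Hypothetical helper (not part of dask-ml): split after a global shuffle."""
    n = x.shape[0]  # assumes chunk sizes are known, so shape is not NaN
    rng = np.random.RandomState(seed)
    idx = rng.permutation(n)  # permutation over all rows, ignoring block boundaries
    n_test = int(np.ceil(test_size * n))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    # Indexing with an integer array may move data between blocks.
    return x[train_idx], x[test_idx]


x = da.arange(8, chunks=4)
train, test = global_shuffle_split(x, test_size=0.25, seed=0)
```

Indexing with an unsorted integer array can be expensive on large arrays, so this is probably only reasonable for small or moderately sized data.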
Environment:
Hey @christhorn2, can I work on this issue?