Open AlexeyPechnikov opened 1 year ago
No comment on the performance, but all .fit methods in dask-ml are eager.
On Tue, Nov 1, 2022 at 2:26 PM Alexey Pechnikov @.***> wrote:
LinearRegression takes even longer than the sklearn version, and it doesn't return a lazy object:
%%time
import dask.array
from sklearn.pipeline import make_pipeline
from dask_ml.linear_model import LinearRegression

size = 1e6
X = dask.array.arange(2*size).reshape(-1,2)
y = dask.array.arange(size).reshape(-1,1)
reg = LinearRegression()
reg.fit(X, y)
[image: image] https://user-images.githubusercontent.com/7342379/199320461-14182b29-9156-4875-b678-e3cdec294976.png
It looks like a Dask-incompatible function.
@TomAugspurger Do you mean dask-ml has no advantages and is slower than sklearn? Obviously, we can't select and process just a subset of the data later if dask-ml methods are not lazy.
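For reference, one possible way to get a lazy fit is to wrap the call in dask.delayed. This is only a minimal sketch under assumptions not made in the thread: it uses the plain sklearn LinearRegression rather than the dask-ml one, a much smaller size, and a 1-D y. When the delayed task runs, Dask materializes X and y as in-memory NumPy arrays, so this defers the work but does not distribute the training.

```python
import dask
import dask.array as da
from sklearn.linear_model import LinearRegression

# Small size chosen for illustration only (the thread uses 1e6).
size = 1_000
X = da.arange(2 * size).reshape(-1, 2)   # rows: [0, 1], [2, 3], ...
y = da.arange(size)                      # 1-D target: 0, 1, 2, ...

# Nothing is computed here: lazy_fit is a Delayed object.
# Assumption: the whole dataset fits in memory on one worker,
# because the dask arrays are converted to NumPy arrays before fit runs.
lazy_fit = dask.delayed(LinearRegression().fit)(X, y)

# The fit actually happens only on compute(); fit returns the estimator.
model = lazy_fit.compute()
```

The trade-off is that the data is gathered in one place for the fit, so this is only reasonable when the training set is small enough to hold in memory.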