Open keyword1983 opened 2 years ago
Dask estimators carry substantial overhead because of the unavoidable inter-GPU transfers, so they only pay off when each worker is given a sufficient workload. When the dataset is large enough to be worth distributing, compute time can be reduced further by using NVLink and InfiniBand, where available, for faster transfers.
Also, fit_transform is supposed to execute lazily and only start the actual work on the compute call. The 23.5 s execution time could be explained by the cuDF operation.
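One way to check where the time goes is to time the lazy call and the compute call separately. The following is a minimal, CPU-only sketch using only the Python standard library. `LazyResult`, `lazy_fit_transform`, `slow_sum_of_squares`, and `timed` are hypothetical stand-ins, not part of Dask or cuML, and the 0.2 s sleep merely simulates expensive work.

```python
import time


class LazyResult:
    """Hypothetical stand-in for a lazy collection: stores work, runs it on compute()."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def compute(self):
        # The deferred work only runs here.
        return self._func(*self._args)


def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return result, elapsed


def slow_sum_of_squares(n):
    time.sleep(0.2)  # simulated expensive work (assumption for illustration)
    return sum(i * i for i in range(n))


def lazy_fit_transform(n):
    # Returns immediately; nothing is computed until compute() is called.
    return LazyResult(slow_sum_of_squares, n)


lazy, build_time = timed("lazy call", lazy_fit_transform, 1000)
value, compute_time = timed("compute", lazy.compute)
```

If a step that is expected to be lazy turns out to dominate the total time, as with the 23.5 s reported here, that suggests some eager work is being done in that step and is worth isolating with the same kind of timing.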
Hi, I tried Dask PCA, but the results look odd.
My environment: NVIDIA-SMI 465.19.01, Driver Version: 465.19.01, CUDA Version: 11.3, RAPIDS: 22.04
Below is my sample code. This is Dask PCA with two 1080 Ti GPUs: fit_transform execution time was 23.5 s and XT.compute() took 5.87 s.
This is cuML PCA without Dask on one 1080 Ti: fit_transform execution time was 3.63 s.
My question is: should Dask PCA with two 1080 Ti GPUs be faster than PCA with one 1080 Ti?
PS: I ran this experiment with V100 GPUs and got the same result.