VibhuJawa opened 2 years ago
For query 5, it appears that using sklearn as a direct replacement for cuml is slightly faster than adjusting to use dask-ml:

| Run | Sklearn | Dask-ml |
|---|---|---|
| 1 | 1731.960032 | 1976.639929 |
| 2 | 1713.143504 | 1890.307189 |
| 3 | 1692.447222 | 1819.198046 |
| 4 | 1679.160072 | 1800.853525 |
| 5 | 1663.727669 | 1791.983971 |
| Avg | 1696.0877 | 1855.796532 |
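As a quick sanity check, the reported averages and the relative slowdown for the first table can be reproduced in plain Python (numbers copied from the table above; dask-ml comes out roughly 9% slower on this run):

```python
# Per-run times copied from the first benchmark table above.
sklearn_times = [1731.960032, 1713.143504, 1692.447222, 1679.160072, 1663.727669]
daskml_times = [1976.639929, 1890.307189, 1819.198046, 1800.853525, 1791.983971]

def avg(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

sk_avg = avg(sklearn_times)
dm_avg = avg(daskml_times)

# Fractional slowdown of dask-ml relative to sklearn.
slowdown = dm_avg / sk_avg - 1

print(f"sklearn avg: {sk_avg:.4f}")
print(f"dask-ml avg: {dm_avg:.6f}")
print(f"dask-ml slower by {slowdown:.1%}")
```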
Edit: Here are times running on a DGX-2:

| Run | Sklearn | Dask-ml |
|---|---|---|
| 1 | 605.7754374 | 712.7177153 |
| 2 | 609.4057972 | 703.8873169 |
| 3 | 592.3652494 | 705.2219992 |
| 4 | 589.4770317 | 704.7177913 |
| 5 | 589.8500378 | 698.2876835 |
| Avg | 597.3747107 | 704.9665012 |
Edit 2: Here are times running on 2 DGX-1s (TCP):

| Run | Sklearn | Dask-ml |
|---|---|---|
| 1 | 865.8754275 | 984.3859689 |
| 2 | 833.6778433 | 968.5142105 |
| 3 | 814.666688 | 939.6765635 |
| 4 | 823.4441831 | 925.5529888 |
| 5 | 806.8892348 | 929.7718291 |
| Avg | 828.9106753 | 949.5803122 |
@ChrisJar, thanks for sharing these benchmarks. Do you have thoughts on how this might change if we scale to 10K? Not saying we should prioritize that, just wondering if you have any thoughts on that front.
The queries below rely on cuML models for the GPU ML portion. Depending on the performance, we need to decide between the distributed (dask-ml) and non-distributed (sklearn) implementations for the ML portion of these queries. I suggest benchmarking both and then choosing the one that gives the best performance.
- Query-05 GPU: `cuml.LogisticRegression`
- Query-20 GPU: `cuml.cluster.kmeans`
- Query-25 GPU: `cuml.cluster.kmeans`
- Query-26 GPU: `cuml.cluster.kmeans`
- Query-28 GPU: `cuml.dask.naive_bayes` (distributed CPU equivalent: `dask_ml.naive_bayes`)
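For context, the distributed vs. non-distributed choice being benchmarked boils down to an API swap along these lines. This is a minimal sketch on toy data; `X` and `y` are illustrative stand-ins, not the actual gpu-bdb query data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the query's feature matrix and labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 5))
y = (X[:, 0] > 0.5).astype(int)

# Option 1: non-distributed -- gather the data to one process and fit sklearn.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.score(X, y))

# Option 2: distributed -- keep the data as dask collections and fit dask-ml
# (commented out; requires dask-ml and a dask cluster to be worthwhile):
# import dask.array as da
# from dask_ml.linear_model import LogisticRegression as DaskLogisticRegression
# Xd = da.from_array(X, chunks=(250, 5))
# yd = da.from_array(y, chunks=250)
# dclf = DaskLogisticRegression().fit(Xd, yd)
```

The trade-off the tables above capture: option 2 avoids gathering the data to one worker but adds scheduler and communication overhead, which at these data sizes appears to outweigh the benefit.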
CC: @DaceT, @randerzander
Related PRs:
https://github.com/rapidsai/gpu-bdb/pull/243
https://github.com/rapidsai/gpu-bdb/pull/244