EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

TPU support for TPOT? #1145

Open neel04 opened 3 years ago

neel04 commented 3 years ago

Is there any way to run TPOT on TPU? I didn't find any info regarding this in the docs, or in the ReadMe. Can anyone throw some light on this issue?

weixuanfu commented 3 years ago

Currently, TPOT does not support TPUs.

neel04 commented 3 years ago

Alright. Would you also happen to know how much time TPOT takes on a regression problem (just a rough estimate) on a typical GPU like a V100 or a P100? @weixuanfu

weixuanfu commented 3 years ago

I suppose that you are using "TPOT cuML" to get GPU-accelerated estimators from RAPIDS cuML and DMLC XGBoost. Unfortunately, I do not know whether RAPIDS cuML supports the V100 or P100, and I only have very limited experience using it on regression problems. I have tested "TPOT cuML" on a 2080 Ti with a regression benchmark of 50,000 samples and 50 features; it took 1-2 days to finish 100 generations with a population size of 100 and cv=5.
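For reference, a run comparable to that benchmark could be set up along the following lines. This is a minimal sketch, assuming the built-in "TPOT cuML" configuration string, a working RAPIDS cuML install on the GPU, and synthetic data standing in for the original benchmark set.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# Synthetic data matching the benchmark shape: 50,000 samples, 50 features
X, y = make_regression(n_samples=50_000, n_features=50, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tpot = TPOTRegressor(
    generations=100,          # settings as in the benchmark described above
    population_size=100,
    cv=5,
    config_dict="TPOT cuML",  # GPU-accelerated cuML / XGBoost estimators
    verbosity=2,
    random_state=42,
)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))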

neel04 commented 3 years ago

Is it compulsory to use TPOT cuML for GPU acceleration, or would the vanilla "pip" install use the GPU anyway?

weixuanfu commented 3 years ago

No, it is an optional configuration. Please check the installation guide for TPOT cuML.
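In other words, GPU use is opt-in at the configuration level. A quick sketch of the distinction (the variable names are only for illustration):

from tpot import TPOTClassifier

# Default pip install: standard CPU-only scikit-learn/XGBoost configuration
cpu_tpot = TPOTClassifier()

# Opt-in GPU configuration: requires RAPIDS cuML (and GPU-enabled XGBoost) to be installed
gpu_tpot = TPOTClassifier(config_dict="TPOT cuML")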

carterrees commented 3 years ago

Follow-up question: is it possible to use multiple GPUs while training TPOT?

beckernick commented 3 years ago

@carterrees yes, you can do this by starting a Dask CUDA cluster and setting use_dask=True. This brief recording shows an example: https://www.youtube.com/watch?v=7z4OJQdY_mw

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
cluster = LocalCUDACluster() # use every GPU on the machine by default
client = Client(cluster)
...
# TPOT as normal, passing use_dask=True and config_dict="TPOT cuML"
...
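For completeness, the elided part of the snippet above might be filled in roughly as follows. This is a sketch assuming TPOTClassifier with synthetic placeholder data; the generation and population sizes are chosen arbitrarily for illustration.

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from sklearn.datasets import make_classification
from tpot import TPOTClassifier

# Start one Dask CUDA worker per GPU on the machine
cluster = LocalCUDACluster()
client = Client(cluster)

# Placeholder data; substitute your own training set
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

tpot = TPOTClassifier(
    generations=5,            # illustration only; tune for your problem
    population_size=20,
    config_dict="TPOT cuML",  # GPU-accelerated estimators
    use_dask=True,            # distribute pipeline evaluations over the Dask workers
    verbosity=2,
)
tpot.fit(X, y)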