kbafna-antuit opened this issue 4 years ago
@KeertiBafna as of now, there is no support. I will convert this into a feature request for future work
@divyegala Thank you for the response. If I have two GPUs on my machine, is there any way to run tasks in parallel on them? Say, task1 on gpu-0 and task2 on gpu-1 in parallel?
@KeertiBafna that can be achieved through the projects https://github.com/rapidsai/dask-cuda and https://dask.org/
Just a small code snippet to get you started:
```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

cluster = LocalCUDACluster()
client = Client(cluster)

workers = list(client.has_what().keys())

client.run(func1, workers=[workers[0]])
client.run(func2, workers=[workers[1]])
```
`LocalCUDACluster()` creates a GPU cluster with the number of GPUs available in your system, controlled by the environment variable `CUDA_VISIBLE_DEVICES`. `Client(cluster)` creates a `dask.distributed.Client` object, which can then help you schedule and run functions on your GPUs in parallel, async, or lazily (depending on your choices). `client.run` here runs two functions in parallel, asynchronously, on two workers (GPUs). You could very well replace them with `model1.fit` and `model2.fit`. Let me know if this answers your question.
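Since the rest of the thread relies on this submit-then-gather pattern, here is a minimal, GPU-free sketch of the same idea using Python's standard `concurrent.futures` as a stand-in for the Dask client. `func1`/`func2` are placeholder CPU tasks invented for the example; with dask-cuda each would instead run on its own GPU worker:

```python
# Sketch of the submit-then-gather pattern, with ThreadPoolExecutor
# standing in for dask.distributed.Client and plain Python functions
# standing in for GPU workloads.
from concurrent.futures import ThreadPoolExecutor

def func1():
    return sum(range(1000))   # placeholder workload

def func2():
    return max(range(1000))   # placeholder workload

with ThreadPoolExecutor(max_workers=2) as pool:
    a = pool.submit(func1)    # analogous to client.submit(func1, workers=[workers[0]])
    b = pool.submit(func2)    # analogous to client.submit(func2, workers=[workers[1]])
    print(a.result(), b.result())
```

The shape of the API is deliberately similar: you submit a *callable*, get a future back, and read `.result()` when you need the value.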
Thanks @divyegala. This is what I wanted to know. To run 2 functions async, do I need to specify some parameter? I tried the above code; I have 2 workers and I initiated 2 model objects (ESM and ARIMA):

```python
est_esm = ExponentialSmoothing(grp_df_1, ts_num=grp_df_1.shape[1])
est_arima = ARIMA(grp_df_1, order=(1,0,1), seasonal_order=(1,0,1,52))
```

Running the client like above shows GPU utilisation on only 1 worker, i.e.

```python
client.run(est_esm.fit(), workers=[workers[0]])
client.run(est_arima.fit(), workers=[workers[1]])
```

Is this the right way, or am I missing something? P.S. I want to fit both models in parallel, one on each GPU worker.
@KeertiBafna that was my fault. I think it's not `client.run`, it is `client.submit`. You should be able to just change the function name without having to change the arguments. Let me know if that works. To confirm, both your estimators are cuml models, right? Also, can you print the result of `client.has_what()` for me if the above substitution does not work?
@divyegala Yes, both are cuml models. Below are the details:

- `grp_df_1` ==> cuDF DataFrame with ts as columns
- `ARIMA` ==> `cuml.tsa.arima`
- `ExponentialSmoothing` ==> from `cuml.tsa.holtwinters`
- `client.has_what()` = `{'tcp://127.0.0.1:33577': (), 'tcp://127.0.0.1:45575': ()}`

The command I run is `client.submit(est_esm.fit(), workers=workers[0])`. Even if I change `workers[0]` to `workers[1]`, I always see my GPU 0 being utilised. Could you help?
@KeertiBafna could you show me your `workers` variable? Also, can you try `workers=[workers[0]]` (putting it in a list)?
@KeertiBafna oh of course, I just realised what was wrong. The model was created on the client. Ideally, you would have something like:

```python
def fit1():
    esm = ExponentialSmoothing()
    esm.fit()

client.submit(fit1, workers=[workers[0]])
```

What this changes is creating the model on the worker and then fitting it there too. Can you try it this way?
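The fix above ("create the model on the worker") can be sketched without any GPU: submit the callable itself, never the result of calling it, and construct the object inside the submitted function so it lives where the task runs. `FakeModel` below is a hypothetical placeholder for a cuml model, and a stdlib executor stands in for the Dask client:

```python
# Submit the *function*, not model.fit() (which would run fit on the
# client and submit its return value). The model is constructed inside
# the task, so it is created on the worker that executes the task.
from concurrent.futures import ThreadPoolExecutor

class FakeModel:                     # hypothetical stand-in for a cuml model
    def __init__(self, data):
        self.data = data
        self.fitted = False
    def fit(self):
        self.fitted = True
        return self

def fit_on_worker():
    model = FakeModel([1, 2, 3])     # built on the worker, not the client
    model.fit()
    return model.fitted

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(fit_on_worker)  # pass the callable itself
    print(fut.result())
```

By contrast, `pool.submit(model.fit())` would call `fit()` eagerly on the client, which is exactly the mistake in the earlier snippet.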
@divyegala This worked. Thank you very much :)

The code that worked, just FYI:

```python
from cuml.tsa.arima import ARIMA

def fit1():
    est = ARIMA(grp_df_1, order=(1,0,1), seasonal_order=(1,0,0,52))
    est.fit()
    frcs = est.forecast(6)
    return frcs

def fit2():
    est2 = ARIMA(grp_df_2, order=(1,0,1), seasonal_order=(1,0,0,52))
    est2.fit()
    frcs = est2.forecast(6)
    return frcs

a = client.submit(fit1, workers=[workers[0]])
b = client.submit(fit2, workers=[workers[1]])

t0 = time.time()
res_a = a.result()
res_b = b.result()
print('Time taken (sec): ', round(time.time() - t0, 4))
```
Awesome! This might have upped both your GPU utilizations, but I believe by calling `a.result()`, you block your client to wait on a's results before reading b. To keep it truly async, you could do `dask.distributed.wait([a, b])`, which means you won't block the client on a anymore; it will wait for both a and b to fully finish and have the results ready. After that, `a.result()` and `b.result()` are non-blocking! :)
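A minimal sketch of that wait-then-read pattern, using `concurrent.futures.wait` as an analogue of `dask.distributed.wait` (the sleep tasks are placeholders for the two model fits):

```python
# Block once for *both* futures, then read each result without
# further blocking - mirroring dask.distributed.wait([a, b]).
import time
from concurrent.futures import ThreadPoolExecutor, wait

def task(n):
    time.sleep(0.2)   # stand-in for a model fit
    return n * n

with ThreadPoolExecutor(max_workers=2) as pool:
    a = pool.submit(task, 3)
    b = pool.submit(task, 4)
    wait([a, b])                         # block until both are done
    results = (a.result(), b.result())   # now non-blocking reads
```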
@divyegala Yes, I did notice that. When I tried to add `dask.distributed.wait([a, b])` after defining a and b, the overall time taken was surprisingly a bit more. Below is the modified code; is there something I missed?

```python
a = client.submit(fit1, workers=[workers[0]])
b = client.submit(fit2, workers=[workers[1]])

t0 = time.time()
dask.distributed.wait([a, b])
res_a = a.result()
res_b = b.result()
print('Time taken (sec): ', round(time.time() - t0, 4))
```
@KeertiBafna I would think that is very use-case dependent. How big are the models that you are fitting? I would say `dask.distributed.wait` starts showing benefits once the run-time needed to fit both a and b is large enough.
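A rough, GPU-free illustration of that point: parallel submission only pays off once each task is long enough to amortize scheduling overhead. Here two 0.3-second stand-in "fits" take roughly half the wall time when submitted to two workers (threads stand in for GPU workers; `time.sleep` stands in for fitting):

```python
# Compare sequential execution against two-worker parallel execution
# of the same pair of long-running stub tasks.
import time
from concurrent.futures import ThreadPoolExecutor, wait

def fit_stub():
    time.sleep(0.3)   # pretend this is a long model fit

t0 = time.time()
fit_stub()
fit_stub()
sequential = time.time() - t0          # ~0.6s

with ThreadPoolExecutor(max_workers=2) as pool:
    t0 = time.time()
    futs = [pool.submit(fit_stub), pool.submit(fit_stub)]
    wait(futs)
    parallel = time.time() - t0        # ~0.3s

print(f"sequential {sequential:.2f}s, parallel {parallel:.2f}s")
```

With very small workloads the scheduler and transfer overhead can dominate, which would explain the slightly worse timing observed on the subset.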
@divyegala Currently I was trying on a subset. I will surely try it at full scale and update you. Anyway, thank you very much for the help.
Is there currently any way to run the RAPIDS time series models (ARIMA, ESM) on multiple GPUs?