clEsperanto / pyclesperanto_prototype

GPU-accelerated bio-image analysis focusing on 3D+t microscopy image data
http://clesperanto.net
BSD 3-Clause "New" or "Revised" License

Default GPU selection #270

Open jo-mueller opened 1 year ago

jo-mueller commented 1 year ago

Hi @haesleinhuepf ,

Following up on our recent discussion, I was wondering how exactly cle selects the default GPU. I had a look on two PCs (both Windows 10).

One is equipped with an RTX 3060 Ti, which cle correctly uses by default. The other PC has an integrated Intel UHD Graphics device as well as a GTX 1650 Ti. On this machine, cle selects the built-in GPU rather than the dedicated one. Is there an option to, for instance, use the GPU with more dedicated memory by default?

StRigaud commented 1 year ago

Correct me if I am wrong, but the default device is selected as follows:

  1. list all available devices
  2. assign a score to each device
  3. sort the devices by score
  4. select the first device in the list (the one with the best score).

The score is the key element. It can be user-defined, but by default it is the device's memory capacity, so the device with the most memory wins. See this piece of code.

This explains why built-in GPUs are usually selected first: they share the computer's main memory, which is far larger than the memory of a dedicated GPU card.
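The four steps above could be sketched roughly as follows. This is only an illustration of the logic as described in this comment, not the library's actual code: the `Device` record, the `pick_default` helper, and the memory sizes are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Device:
    # Hypothetical stand-in for an OpenCL device record.
    name: str
    global_mem_bytes: int


def pick_default(devices: Sequence[Device],
                 score: Optional[Callable[[Device], float]] = None) -> Device:
    """Return the best-scoring device, or the first listed one if no score is given."""
    if not devices:
        raise RuntimeError("no devices available")
    if score is None:
        # Without a score, the enumeration order decides.
        return devices[0]
    # Sort by score, best first, then take the head of the list.
    return sorted(devices, key=score, reverse=True)[0]


# Hypothetical machine: the integrated GPU reports shared system memory.
devices = [
    Device("NVIDIA GeForce GTX 1650 Ti", 4 * 1024**3),
    Device("Intel(R) UHD Graphics", 16 * 1024**3),
]

by_memory = pick_default(devices, score=lambda d: d.global_mem_bytes)
print(by_memory.name)
```

With a memory-based score, the integrated device wins in this made-up setup simply because it reports the larger (shared) memory pool.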

haesleinhuepf commented 1 year ago

Interesting discussion! So the actual listing/sorting happens here: https://github.com/clEsperanto/pyclesperanto_prototype/blob/006379e7302cc0f76a571b5cdf3fb3962c9461c9/pyclesperanto_prototype/_tier0/_device.py#L107-L120

And it appears that the scoring function is None: https://github.com/clEsperanto/pyclesperanto_prototype/blob/006379e7302cc0f76a571b5cdf3fb3962c9461c9/pyclesperanto_prototype/_tier0/_device.py#L48

The entire code section looks a bit awkward btw. However: https://en.wiktionary.org/wiki/never_change_a_running_system

@jo-mueller , I often write cle.select_device('TX') to select an NVidia GPU. If it's not found, it falls back to whatever is available. Might that be a suitable workaround for you?
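To make the fallback behaviour concrete, here is a small stand-alone sketch. It assumes plain, case-sensitive substring matching on the device name, and `select_by_name` is a hypothetical helper written for this illustration, not a function of the library.

```python
from typing import Optional, Sequence


def select_by_name(device_names: Sequence[str],
                   name_fragment: Optional[str] = None) -> str:
    """Return the first name containing the fragment, else the first available name."""
    if not device_names:
        raise RuntimeError("no devices available")
    if name_fragment is not None:
        for name in device_names:
            if name_fragment in name:
                return name
    # Nothing matched: fall back to whatever is listed first.
    return device_names[0]


# Hypothetical device list in the order the system reports it.
names = ["Intel(R) UHD Graphics", "NVIDIA GeForce GTX 1650 Ti"]

matched = select_by_name(names, "TX")       # 'TX' is contained in both 'GTX' and 'RTX'
fallback = select_by_name(names, "Radeon")  # no match, so the first device is used
print(matched, "|", fallback)
```

The fragment 'TX' is handy because it matches both GTX and RTX cards, so the same line works on different machines.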

StRigaud commented 1 year ago

So if the sorting function is None, that means that the order of the listed devices is (kind of) random?

haesleinhuepf commented 1 year ago

It's the operating system setting the order. In CLIJ it's also like that

jo-mueller commented 1 year ago

> @jo-mueller , I often write cle.select_device('TX') to select an NVidia GPU. If it's not found, it falls back to whatever is available. Might that be a suitable workaround for you?

Yes and no. I stumbled over the problem specifically when I use cle within distributed dask jobs. To be more precise:

If I - for instance in a notebook - do

import pyclesperanto_prototype as cle
cle.select_device('GTX')

results = some_function_that_uses_cle(input)

I see that the GPU is indeed used correctly. If I do

import pyclesperanto_prototype as cle
from dask.distributed import Client

cle.set_device('GTX')

client = Client()
client.submit(some_function_that_uses_cle, input)

I see that cle uses the default (integrated) GPU and not the one that I selected at the top of the script. This is probably a dask-specific question, but it makes sense to me that global state (like the selected GPU) is not passed into the scope of a dask-serialized job.

I guess I could just add a variable to the keywords of my functions to be executed so that some_function_that_uses_cle selects the correct GPU according to a passed variable, but I was hoping that I could somehow modify the default behavior :)
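A minimal sketch of that idea, under some assumptions: `run_on_device` is a hypothetical wrapper name, and the `select_device` parameter exists only so the wrapper can be tried without a GPU. If it is left out, the wrapper imports cle inside the job and calls `cle.select_device`, so the selection happens in the process that actually runs the work.

```python
def run_on_device(func, device_name, *args, select_device=None, **kwargs):
    """Select a device by name, then call func(*args, **kwargs) in the same process."""
    if select_device is None:
        # Imported here so the import and the selection happen inside the worker.
        import pyclesperanto_prototype as cle
        select_device = cle.select_device
    select_device(device_name)
    return func(*args, **kwargs)


# Dry run with a fake selector, so this snippet needs neither a GPU nor dask.
selected = []


def double(x):
    return 2 * x


result = run_on_device(double, "GTX", 21, select_device=selected.append)
print(selected, result)
```

With dask, the call would then presumably look like `client.submit(run_on_device, some_function_that_uses_cle, 'GTX', input)`. Another option might be to run the selection once on every worker via `client.run(...)`, but I have not checked how that interacts with cle's global state.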

haesleinhuepf commented 1 year ago

Hey @jo-mueller ,

I'm not sure if I understand your use case. When I execute the code you provided, I receive this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 4
      1 import pyclesperanto_prototype as cle
      2 from dask.distributed import Client
----> 4 cle.set_device('GTX')

AttributeError: module 'pyclesperanto_prototype' has no attribute 'set_device'

Feel free to create a minimal working example notebook so that I can try.