SMART-Lab / smartdispatch

An easy to use job launcher for supercomputers with PBS compatible job manager.
Do What The F*ck You Want To Public License

GPU module CUDA forced load #177

Open bouthilx opened 7 years ago

bouthilx commented 7 years ago

We ran into a problem with the default module in the GPU queues on Cedar. I added a CUDA module to the Cedar config, just as is done for other clusters (Helios, for example), but our lab stack's CUDA conflicts with the module loaded by SmartDispatch. I understand that adding the module to the queues is convenient because most people need it to use the GPU, but what if the user wants something else? In the best case, the module is loaded for nothing; in the worst case, it conflicts with the user's own.

I would suggest that we remove all CUDA modules from the configuration files. Users already need to set up their environments, and loading CUDA should be part of that. We could add a check for CUDA when a GPU is requested, to alert users that they requested a GPU but did not load a CUDA module. That check would only be temporary, to ease the transition for users who currently rely on SmartDispatch loading the CUDA module for them.
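To make the proposed transition check concrete, here is a minimal sketch. It is not SmartDispatch code: the function names `cuda_module_loaded` and `check_gpu_request` and the `nb_gpus` parameter are hypothetical. It assumes the cluster's module system exports the colon-separated `LOADEDMODULES` environment variable (as Environment Modules and Lmod do) and that CUDA modules have "cuda" somewhere in their name.

```python
import os
import warnings


def cuda_module_loaded(environ=None):
    """Return True if any loaded module name contains 'cuda' (case-insensitive).

    Assumption: the module system lists loaded modules in LOADEDMODULES,
    separated by colons.
    """
    environ = os.environ if environ is None else environ
    loaded = environ.get("LOADEDMODULES", "")
    return any("cuda" in name.lower() for name in loaded.split(":") if name)


def check_gpu_request(nb_gpus, environ=None):
    """Warn if GPUs are requested but no CUDA module appears to be loaded.

    Hypothetical helper: returns True when a warning was emitted.
    """
    if nb_gpus > 0 and not cuda_module_loaded(environ):
        warnings.warn(
            "A GPU was requested but no CUDA module seems to be loaded. "
            "SmartDispatch no longer loads one for you; load it in your "
            "environment before launching."
        )
        return True
    return False
```

The sketch only warns and never aborts, because a user whose CUDA comes from a lab stack rather than a module would otherwise be blocked by a false positive.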