FenTechSolutions / CausalDiscoveryToolbox

Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.
https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html
MIT License

[BUG] autoset_settings() fails with MIG GPU #156

Open btravouillon opened 1 year ago

btravouillon commented 1 year ago

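With a MIG GPU, CUDA_VISIBLE_DEVICES holds a MIG device identifier (a string starting with MIG-) rather than a list of integer device indices such as 0,1, so it is not a valid Python literal.

This breaks autoset_settings() at https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/d0bc352534dcbfac19a84a1bb05f33fe311378d2/cdt/utils/Settings.py#L152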

This is reproducible on an NVIDIA A100 GPU with MIG enabled:

$ echo $CUDA_VISIBLE_DEVICES 
MIG-d5385a41-e608-5c4f-92c5-0932c7b636cd
$ python3.10 -c 'import ast, os; devices = ast.literal_eval(os.environ["CUDA_VISIBLE_DEVICES"]);'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/cvmfs/ai.mila.quebec/apps/arch/distro/python/3.10/lib/python3.10/ast.py", line 64, in literal_eval
    node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
  File "/cvmfs/ai.mila.quebec/apps/arch/distro/python/3.10/lib/python3.10/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    MIG-d5385a41-e608-5c4f-92c5-0932c7b636cd
                      ^
SyntaxError: invalid decimal literal

Note that the format of this string varies with the NVIDIA driver version (older or newer than R470), see https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#cuda-visible-devices
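
One possible workaround, as a minimal sketch rather than a proposed patch: split the variable on commas and keep each entry as a string instead of passing it to ast.literal_eval. The helper name parse_visible_devices is hypothetical and not part of cdt. It assumes only the number of listed devices matters to the caller, and it does not try to validate the identifiers.

```python
import os


def parse_visible_devices(value):
    """Split a CUDA_VISIBLE_DEVICES value into a list of device identifiers.

    Hypothetical helper, not part of cdt. Entries are kept as strings, so
    integer indices ("0,1") and MIG identifiers are handled the same way.
    """
    if value is None:
        # Variable not set: report no explicitly listed devices.
        return []
    # Drop empty entries so "" and trailing commas do not count as devices.
    return [item.strip() for item in value.split(",") if item.strip()]


devices = parse_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES"))
print(len(devices), devices)
```

With the value from the reproduction above, this returns a one-element list containing the MIG identifier instead of raising SyntaxError.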