StanfordLegion / legion

The Legion Parallel Programming System
https://legion.stanford.edu
Apache License 2.0

Realm: Crashes on WSL #1720

Closed RAMitchell closed 1 month ago

RAMitchell commented 1 month ago

When attempting to run a Legate program with a GPU on Windows Subsystem for Linux (WSL), I get the following crash:

[0 - 7f7923ad5000]    0.000000 {5}{gpu}: /tmp/conda-croot/legate_core/work/arch-conda/_skbuild/linux-x86_64-3.12/cmake-build/_deps/legion-src/runtime/realm/cuda/cuda_module.cc(4426):NVML_FNPTR(nvmlDeviceGetMemoryAffinity)( info->nvml_dev, info->MAX_NUMA_NODE_LEN, info->numa_node_affinity, NVML_AFFINITY_SCOPE_NODE) = 3

Return code 3 corresponds to NVML_ERROR_NOT_SUPPORTED ("The requested operation is not available on target device"). The function nvmlDeviceGetMemoryAffinity is not supported on Windows.

When running Legate with CPUs only, the program completes without issue.

The workaround would be to provide a fallback when this API function is not supported. There may also be other Linux-only functions in use; this was just the first one I hit.
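A minimal sketch of what such a fallback could look like, assuming the NVML call is wrapped in a callable. This is not the actual Realm code: `QueryStatus`, `AffinityQuery`, `AffinityResult`, and `query_affinity_with_fallback` are hypothetical names made up for illustration, and the two status values are local stand-ins that only mirror NVML_SUCCESS (0) and NVML_ERROR_NOT_SUPPORTED (3). The idea is to treat "not supported" as "no NUMA affinity information" rather than as a fatal error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Local stand-ins mirroring the two NVML return codes discussed above.
// Hypothetical names; real code would use the values from nvml.h.
enum QueryStatus : int {
  kQuerySuccess = 0,       // mirrors NVML_SUCCESS
  kQueryNotSupported = 3,  // mirrors NVML_ERROR_NOT_SUPPORTED
};

// Hypothetical wrapper around the affinity query: fills `mask` with the
// NUMA affinity bitmask and returns a status code.
using AffinityQuery = std::function<int(std::vector<unsigned long> &mask)>;

struct AffinityResult {
  bool ok;                          // false only on an unexpected error
  bool has_affinity;                // false when the query is unsupported
  std::vector<unsigned long> mask;  // all zeros when has_affinity is false
};

AffinityResult query_affinity_with_fallback(const AffinityQuery &query,
                                            std::size_t mask_len) {
  assert(query && "query callable must be set");
  AffinityResult result{true, false, std::vector<unsigned long>(mask_len, 0UL)};

  const int status = query(result.mask);
  if (status == kQuerySuccess) {
    result.has_affinity = true;
    return result;
  }
  if (status == kQueryNotSupported) {
    // Fallback: the platform cannot report affinity, so discard anything the
    // query wrote and continue with "no NUMA preference" instead of aborting.
    result.mask.assign(mask_len, 0UL);
    std::fprintf(stderr, "affinity query not supported; using no NUMA affinity\n");
    return result;
  }
  // Any other status is still reported to the caller as a real error.
  result.ok = false;
  return result;
}
```

With this shape, a caller checks `ok` first and then uses `mask` only when `has_affinity` is true. Other Linux-only queries could be wrapped the same way.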

muraj commented 1 month ago

https://gitlab.com/StanfordLegion/legion/-/merge_requests/1396 is out for review and should resolve the WSL issues, as tested locally.

RAMitchell commented 1 month ago

Thank you!

muraj commented 1 month ago

e386769bebe322d223cda0b59a34d7bf0e04e2f9 was merged in, moving to close. Please verify and reopen if this is still an issue. Thank you!