Open qy0720 opened 1 month ago
PYTHONPATH: :/opt/tiger/arnold/arnold_entrypoint:/usr/bin/srun:/opt/tiger/arnold_toolbox:/opt/tiger/api_common:/opt/tiger/load:/opt/tiger/studio_loader:/opt/tiger/arnold_toolbox:/opt/tiger/rh2:/opt/tiger/rh2:/opt/tiger/pyutil:/python:/python/lib/py4j-0.10.9-src.zip:/opt/tiger/rh2:/opt/tiger/load:/opt/tiger/arnold_toolbox:/opt/tiger/api_common:/opt/tiger/pyutil:/opt/tiger/arnold/arnold_entrypoint:/opt/tiger/studio_loader
which python: /usr/bin/python
PYTHONPATH: :/opt/tiger/arnold/arnold_entrypoint:/usr/bin/srun:/opt/tiger/arnold_toolbox:/opt/tiger/api_common:/opt/tiger/load:/opt/tiger/studio_loader:/opt/tiger/arnold_toolbox:/opt/tiger/rh2:/opt/tiger/rh2:/opt/tiger/pyutil:/python:/python/lib/py4j-0.10.9-src.zip:/opt/tiger/rh2:/opt/tiger/load:/opt/tiger/arnold_toolbox:/opt/tiger/api_common:/opt/tiger/pyutil:/opt/tiger/arnold/arnold_entrypoint:/opt/tiger/studio_loader:/usr/bin/python:.
srun: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
srun: error: fetch_config: DNS SRV lookup failed
srun: error: _establish_config_source: failed to fetch config
srun: fatal: Could not establish a configuration source
If you don't have a Slurm cluster, you can launch the job directly with torchrun (DDP) instead of srun. You only need to set MASTER_ADDR and MASTER_PORT manually.
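As a rough illustration of the suggestion above, here is a minimal sketch of how a training script could set the rendezvous variables itself before initializing DDP. The helper names `ensure_rendezvous_env` and `init_distributed`, and the default address `127.0.0.1` and port `29500`, are assumptions for illustration and are not taken from this repository. It also assumes the launcher (for example torchrun) exports RANK and WORLD_SIZE.

```python
import os


def ensure_rendezvous_env(addr="127.0.0.1", port=29500):
    """Set MASTER_ADDR / MASTER_PORT only if they are not already set.

    The defaults are illustrative values for a single-node run (assumption).
    """
    os.environ.setdefault("MASTER_ADDR", addr)
    os.environ.setdefault("MASTER_PORT", str(port))
    return os.environ["MASTER_ADDR"], int(os.environ["MASTER_PORT"])


def init_distributed(backend="nccl"):
    """Initialize torch.distributed from environment variables.

    Hypothetical helper: torch is imported lazily so the function above
    can be used without it.
    """
    import torch.distributed as dist

    ensure_rendezvous_env()
    # "env://" reads MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE
    # from the environment; RANK and WORLD_SIZE must come from the launcher.
    dist.init_process_group(backend=backend, init_method="env://")
    return dist.get_rank(), dist.get_world_size()
```

With something like this in place, a single-node run could be started with a command along the lines of `MASTER_ADDR=127.0.0.1 MASTER_PORT=29500 torchrun --nproc_per_node=8 train.py`, where `train.py` and the process count are placeholders for your own entry point and GPU count.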