Hi, I get a lot of errors when trying to run the following sample:
sh scripts/run_car.sh
Can you help me? I spent the whole weekend trying to run your project, but so far I have had no luck.
This is the output I get. Thanks!
/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/launch.py:163: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(
The module torch.distributed.launch is deprecated and going to be removed in future.Migrate to torch.distributed.run
WARNING:torch.distributed.run:--use_env is deprecated and will be removed in future releases.
Please read local_rank from `os.environ('LOCAL_RANK')` instead.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
entrypoint : run.py
min_nodes : 1
max_nodes : 1
nproc_per_node : 4
run_id : none
rdzv_backend : static
rdzv_endpoint : 127.0.0.1:29578
rdzv_configs : {'rank': 0, 'timeout': 900}
max_restarts : 3
monitor_interval : 5
log_dir : None
metrics_cfg : {}
INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_5326gy0x/none_sdkujdk4
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/utils/store.py:52: FutureWarning: This is an experimental API and will be changed in future.
warnings.warn(
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=0
master_addr=127.0.0.1
master_port=29578
group_rank=0
group_world_size=1
local_ranks=[0, 1, 2, 3]
role_ranks=[0, 1, 2, 3]
global_ranks=[0, 1, 2, 3]
role_world_sizes=[4, 4, 4, 4]
global_world_sizes=[4, 4, 4, 4]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_0/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_0/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_0/3/error.json
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
Traceback (most recent call last):
from skimage.measure import compare_ssim
File "run.py", line 7, in <module>
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 17154) of binary: /home/darkayserleo/anaconda3/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 3/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=1
master_addr=127.0.0.1
master_port=29578
group_rank=0
group_world_size=1
local_ranks=[0, 1, 2, 3]
role_ranks=[0, 1, 2, 3]
global_ranks=[0, 1, 2, 3]
role_world_sizes=[4, 4, 4, 4]
global_world_sizes=[4, 4, 4, 4]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_1/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_1/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_1/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_1/3/error.json
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 17187) of binary: /home/darkayserleo/anaconda3/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 2/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=2
master_addr=127.0.0.1
master_port=29578
group_rank=0
group_world_size=1
local_ranks=[0, 1, 2, 3]
role_ranks=[0, 1, 2, 3]
global_ranks=[0, 1, 2, 3]
role_world_sizes=[4, 4, 4, 4]
global_world_sizes=[4, 4, 4, 4]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_2/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_2/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_2/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_2/3/error.json
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
Traceback (most recent call last):
File "run.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 17220) of binary: /home/darkayserleo/anaconda3/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 1/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=3
master_addr=127.0.0.1
master_port=29578
group_rank=0
group_world_size=1
local_ranks=[0, 1, 2, 3]
role_ranks=[0, 1, 2, 3]
global_ranks=[0, 1, 2, 3]
role_world_sizes=[4, 4, 4, 4]
global_world_sizes=[4, 4, 4, 4]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_3/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_3/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_3/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_5326gy0x/none_sdkujdk4/attempt_3/3/error.json
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
Traceback (most recent call last):
File "run.py", line 7, in <module>
from gan2shape import setup_runtime, Trainer, GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/__init__.py", line 3, in <module>
from .model import GAN2Shape
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/model.py", line 16, in <module>
from .stylegan2 import Generator, Discriminator, PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/__init__.py", line 15, in <module>
from lpips import PerceptualLoss
File "/home/darkayserleo/Documentos/Tesis/GAN2Shape/gan2shape/stylegan2/stylegan2-pytorch/lpips/__init__.py", line 7, in <module>
from skimage.measure import compare_ssim
ImportError: cannot import name 'compare_ssim' from 'skimage.measure' (/home/darkayserleo/anaconda3/lib/python3.8/site-packages/skimage/measure/__init__.py)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 17253) of binary: /home/darkayserleo/anaconda3/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:Local worker group finished (FAILED). Waiting 300 seconds for other agents to finish
/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/utils/store.py:70: FutureWarning: This is an experimental API and will be changed in future.
warnings.warn(
INFO:torch.distributed.elastic.agent.server.api:Done waiting for other agents. Elapsed: 0.0003464221954345703 seconds
{"name": "torchelastic.worker.status.FAILED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 0, "group_rank": 0, "worker_id": "17253", "role": "default", "hostname": "DarKayserLeo", "state": "FAILED", "total_run_time": 20, "rdzv_backend": "static", "raw_error": "{\"message\": \"<NONE>\"}", "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [0], \"role_rank\": [0], \"role_world_size\": [4]}", "agent_restarts": 3}}
{"name": "torchelastic.worker.status.FAILED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 1, "group_rank": 0, "worker_id": "17254", "role": "default", "hostname": "DarKayserLeo", "state": "FAILED", "total_run_time": 20, "rdzv_backend": "static", "raw_error": "{\"message\": \"<NONE>\"}", "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [1], \"role_rank\": [1], \"role_world_size\": [4]}", "agent_restarts": 3}}
{"name": "torchelastic.worker.status.FAILED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 2, "group_rank": 0, "worker_id": "17255", "role": "default", "hostname": "DarKayserLeo", "state": "FAILED", "total_run_time": 20, "rdzv_backend": "static", "raw_error": "{\"message\": \"<NONE>\"}", "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [2], \"role_rank\": [2], \"role_world_size\": [4]}", "agent_restarts": 3}}
{"name": "torchelastic.worker.status.FAILED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 3, "group_rank": 0, "worker_id": "17256", "role": "default", "hostname": "DarKayserLeo", "state": "FAILED", "total_run_time": 20, "rdzv_backend": "static", "raw_error": "{\"message\": \"<NONE>\"}", "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [3], \"role_rank\": [3], \"role_world_size\": [4]}", "agent_restarts": 3}}
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "AGENT", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": null, "group_rank": 0, "worker_id": null, "role": "default", "hostname": "DarKayserLeo", "state": "SUCCEEDED", "total_run_time": 20, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\"}", "agent_restarts": 3}}
/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py:354: UserWarning:
**********************************************************************
CHILD PROCESS FAILED WITH NO ERROR_FILE
**********************************************************************
CHILD PROCESS FAILED WITH NO ERROR_FILE
Child process 17253 (local_rank 0) FAILED (exitcode 1)
Error msg: Process failed with exitcode 1
Without writing an error file to <N/A>.
While this DOES NOT affect the correctness of your application,
no trace information about the error will be available for inspection.
Consider decorating your top level entrypoint function with
torch.distributed.elastic.multiprocessing.errors.record. Example:
from torch.distributed.elastic.multiprocessing.errors import record
@record
def trainer_main(args):
# do train
**********************************************************************
warnings.warn(_no_error_file_warning_msg(rank, failure))
Traceback (most recent call last):
File "/home/darkayserleo/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/darkayserleo/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 173, in <module>
main()
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 169, in main
run(args)
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 621, in run
elastic_launch(
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
return f(*args, **kwargs)
File "/home/darkayserleo/anaconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
***************************************
run.py FAILED
=======================================
Root Cause:
[0]:
time: 2021-07-25_17:09:38
rank: 0 (local_rank: 0)
exitcode: 1 (pid: 17253)
error_file: <N/A>
msg: "Process failed with exitcode 1"
=======================================
Other Failures:
[1]:
time: 2021-07-25_17:09:38
rank: 1 (local_rank: 1)
exitcode: 1 (pid: 17254)
error_file: <N/A>
msg: "Process failed with exitcode 1"
[2]:
time: 2021-07-25_17:09:38
rank: 2 (local_rank: 2)
exitcode: 1 (pid: 17255)
error_file: <N/A>
msg: "Process failed with exitcode 1"
[3]:
time: 2021-07-25_17:09:38
rank: 3 (local_rank: 3)
exitcode: 1 (pid: 17256)
error_file: <N/A>
msg: "Process failed with exitcode 1"
***************************************
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
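The root failure in the log above appears to be the `ImportError` on `compare_ssim`: every worker stops at the same import in `lpips/__init__.py`, and the launcher errors that follow look like consequences of that. As far as I know, newer scikit-image releases removed `skimage.measure.compare_ssim` and provide `skimage.metrics.structural_similarity` instead. Below is a minimal sketch of a fallback import, assuming that is the cause. `import_first_available` is a hypothetical helper written for illustration; it is not part of GAN2Shape, lpips, or scikit-image.

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute that imports cleanly.

    `candidates` is a list of (module_name, attribute_name) pairs,
    tried in order. Hypothetical helper, for illustration only.
    """
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError(
        "none of the candidates could be imported:\n" + "\n".join(errors)
    )


# Candidate locations of the SSIM function, newest name first.
# The second entry is the name the failing import in the log asks for.
SSIM_CANDIDATES = [
    ("skimage.metrics", "structural_similarity"),
    ("skimage.measure", "compare_ssim"),
]

if __name__ == "__main__":
    # Needs scikit-image installed; raises ImportError otherwise.
    compare_ssim = import_first_available(SSIM_CANDIDATES)
    print("SSIM function resolved to:", compare_ssim.__name__)
```

If this guess is right, the import on line 7 of `lpips/__init__.py` could be replaced with a lookup like the one above, or the environment could be pinned to an older scikit-image release that still ships `compare_ssim`. I have not verified either option against this repository.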