modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Apache License 2.0

Error when using multimodal speaker-diarization #77

Closed xiulianzw closed 5 months ago

xiulianzw commented 5 months ago

The error output is as follows:

2024-03-06 14:26:02,395 - modelscope - INFO - Loading done! Current index file version is 1.12.0, with md5 a26d8f8592752068a9876d1806203fda and a total number of 964 components indexed
2024-03-06 14:26:02,419 - modelscope - INFO - Loading done! Current index file version is 1.12.0, with md5 a26d8f8592752068a9876d1806203fda and a total number of 964 components indexed
2024-03-06 14:26:02,421 - modelscope - INFO - PyTorch version 2.0.1 Found.
2024-03-06 14:26:02,421 - modelscope - INFO - Loading ast index from /home/root/.cache/modelscope/ast_indexer
2024-03-06 14:26:02,472 - modelscope - INFO - Loading done! Current index file version is 1.12.0, with md5 a26d8f8592752068a9876d1806203fda and a total number of 964 components indexed
Traceback (most recent call last):
  File "local/extract_diar_embeddings.py", line 206, in <module>
    main()
  File "local/extract_diar_embeddings.py", line 98, in main
    dist.init_process_group(backend='gloo')
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 907, in init_process_group
    default_pg = _new_process_group_helper(
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 1009, in _new_process_group_helper
    backend_class = ProcessGroupGloo(backend_prefix_store, group_rank, group_size, timeout=timeout)
RuntimeError: [enforce fail at /opt/conda/conda-bld/pytorch_1682343962757/work/third_party/gloo/gloo/transport/tcp/device.cc:209] ifa != nullptr. Unable to find interface for: [0.0.12.18]

[The traceback above appears eight times in total in the original log; the seven identical repeats are omitted here.]

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 31135) of binary: /data1/root/miniconda3/envs/3D-Speaker/bin/python
Traceback (most recent call last):
  File "/data1/root/miniconda3/envs/3D-Speaker/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.0.1', 'console_scripts', 'torchrun')())
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data1/root/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

xiulianzw commented 5 months ago

This is solved. See https://discuss.pytorch.org/t/runtime-error-using-distributed-with-gloo/16579. Run ifconfig to list the network interfaces, then set export GLOO_SOCKET_IFNAME= to the device name shown in the ifconfig output.
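
For anyone who would rather set this from Python than from the shell, here is a minimal sketch of the same fix. It is not part of 3D-Speaker: the helpers `choose_interface` and `configure_gloo` are hypothetical names, and the "first non-loopback interface" rule is only an assumed fallback that can pick the wrong device (a Docker bridge, for example). Passing the name you saw in ifconfig explicitly is the safer option.

```python
import os
import socket


def choose_interface(names, preferred=None):
    """Pick an interface name from `names` (hypothetical helper).

    If `preferred` is given it must be present in `names`. Otherwise the
    first name that does not start with "lo" is used, which is only a
    rough guess; loopback is the last resort.
    """
    if preferred is not None:
        if preferred not in names:
            raise ValueError(
                f"interface {preferred!r} not found; available: {list(names)}"
            )
        return preferred
    non_loopback = [n for n in names if not n.startswith("lo")]
    candidates = non_loopback or list(names)
    if not candidates:
        raise RuntimeError("no network interfaces found")
    return candidates[0]


def configure_gloo(preferred=None):
    """Set GLOO_SOCKET_IFNAME for this process and return the chosen name."""
    # socket.if_nameindex() returns (index, name) pairs for local interfaces.
    names = [name for _, name in socket.if_nameindex()]
    ifname = choose_interface(names, preferred)
    os.environ["GLOO_SOCKET_IFNAME"] = ifname
    return ifname
```

To use it, you would call something like `configure_gloo("eth0")` (with your own device name in place of `eth0`) before the `dist.init_process_group(backend='gloo')` call in the script. Exporting the variable in the shell before launching torchrun, as described above, should have the same effect, since the worker processes inherit the environment.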