Open SXQ-STUDY opened 1 year ago
$ sh tools/dist_train.sh local_configs\bisenetv2\bisenetv2_fcn_1xb32-amp-160k_cityscapes-512x1024.py 8
NOTE: Redirects are currently not supported in Windows or MacOs.
D:\Anaconda3\envs\jbnight\lib\site-packages\torch\distributed\launch.py:180: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[E C:\cb\pytorch_1000000000000\work\torch\csrc\distributed\c10d\socket.cpp:860] [c10d] The client socket has timed out after 900s while trying to connect to (127.0.0.1, 29500).
Traceback (most recent call last):
File "D:\Anaconda3\envs\jbnight\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "D:\Anaconda3\envs\jbnight\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\Anaconda3\envs\jbnight\lib\site-packages\torch\distributed\launch.py", line 195, in
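The FutureWarning in the log asks scripts to read the local rank from os.environ['LOCAL_RANK'] rather than from a --local_rank argument. A minimal sketch of that pattern follows. The helper name get_local_rank is hypothetical and is not part of MMSegmentation or PyTorch. Keeping the command-line flag as a fallback is an assumption made here for backward compatibility, and this change addresses only the deprecation warning, not necessarily the socket timeout.

```python
import argparse
import os


def get_local_rank(argv=None):
    """Return the local rank, preferring the LOCAL_RANK environment variable.

    Hypothetical helper for illustration: torchrun exports LOCAL_RANK, while
    the deprecated launcher passed --local_rank on the command line.
    """
    parser = argparse.ArgumentParser()
    # Accept both spellings of the legacy flag; default to rank 0.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    # parse_known_args ignores any other arguments the real script may take.
    args, _ = parser.parse_known_args(argv)
    # The environment variable wins when it is set.
    return int(os.environ.get("LOCAL_RANK", args.local_rank))


if __name__ == "__main__":
    print("local rank:", get_local_rank())
```

When launched through torchrun, the environment variable is set for each worker, so the flag is only consulted for older launch commands.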
I have the same problem. Can anyone help?
Checklist
Describe the bug
The following error occurred when I ran single-machine multi-GPU training on Windows 10:
Reproduction
What command or script did you run?
Did you make any modifications on the code or config? Did you understand what you have modified?
No changes have been made.
Environment
sys.platform: win32
Python: 3.10.8 | packaged by conda-forge | (main, Nov 24 2022, 14:07:00) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA A100-SXM4-40GB
CUDA_HOME: D:\Anaconda3\envs\jbnight
NVCC: Cuda compilation tools, release 11.7, V11.7.99
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.34.31942 for x64
GCC: n/a
PyTorch: 1.13.0
PyTorch compiling details: PyTorch built with:
TorchVision: 0.14.0
OpenCV: 4.7.0
MMEngine: 0.7.0
MMSegmentation: 1.0.0rc6+478e28a
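Since the log reports that the client socket timed out while connecting to (127.0.0.1, 29500), one quick check is whether anything is listening on that port and whether the port can be bound at all. The sketch below uses only the Python standard library. The function names can_connect and can_bind are hypothetical, and the host and port are taken from the error message. It is a diagnostic aid under those assumptions, not a fix.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def can_bind(host, port):
    """Return True if this process can bind a TCP socket to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            # Typically means the port is in use or blocked.
            return False


if __name__ == "__main__":
    host, port = "127.0.0.1", 29500  # values from the error message
    print("something is listening:", can_connect(host, port))
    print("port can be bound:", can_bind(host, port))
```

Run it while no training job is active to see whether the port is free, and again while the launcher is waiting to see whether rank 0 has actually started listening.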