Open sankexin opened 3 months ago
mv checkpoints/vocoder.pt ./
python train.py
加载自定义词典成功 (custom dictionary loaded successfully)
加载自定义词典成功 (custom dictionary loaded successfully)
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0425 17:05:53.758841 4566 ProcessGroupNCCL.cpp:686] [Rank 0] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 1800000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=187465152
Traceback (most recent call last):
  File "train.py", line 106, in <module>
    torch.multiprocessing.spawn(train, args=(world_size,), nprocs=world_size, join=True)
  File "/usr/local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 246, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method="spawn")
  File "/usr/local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 202, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 145, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGBUS
env
pip list | grep torch
apex         0.1+2a4864d.abi0.dtk2310.torch2.1
torch        2.1.0a0+git793d2b5.abi0.dtk2310
torchaudio   2.1.2+4b32183.abi0.dtk2310.torch2.1.0a0
torchvision  0.16.0+git267eff6.abi0.dtk2310.torch2.1.0