TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: Optional[torch.Tensor], arg5: Optional[torch.Tensor], arg6: bool) -> torch.Tensor
and
  File "/home/wangweijun/.conda/envs/vim/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
main.py FAILED
Failures:
[1]:
  time      : 2024-06-13_14:17:44
  host      : aiot-a100
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 1462927)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
  time      : 2024-06-13_14:17:44
  host      : aiot-a100
  rank      : 6 (local_rank: 6)
  exitcode  : 1 (pid: 1462931)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
  time      : 2024-06-13_14:17:44
  host      : aiot-a100
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 1462926)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
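
For anyone trying to reproduce this: an "incompatible function arguments" TypeError from a compiled extension generally means the Python caller passed arguments that do not match the signature the binary was built with. One possible cause (an assumption, the log above does not confirm it) is that the installed causal_conv1d build differs from the version the calling code was written for. A minimal sketch for collecting the installed versions to attach to the report is below. The distribution names `causal-conv1d` and `mamba-ssm` are assumed to be what is installed in the `vim` environment, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Assumed distribution names; adjust to whatever is installed in your environment.
PACKAGES = ("torch", "causal-conv1d", "mamba-ssm")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not present in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same conda environment that launches main.py prints one line per package, which makes it easier to compare against the versions the project's install instructions expect.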