Running the finetune_demo code on Windows 11 raises an error

System Info

deepspeed 0.14.0
triton 2.1.0
torch installed from torch-2.2.1+cu121-cp311-cp311-win_amd64.whl

Who can help?

finetune_demo: @1049451037

Information

[X] The official example scripts
[ ] My own modified scripts and tasks

Reproduction

[2024-07-30 17:30:18,378] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[2024-07-30 17:30:23,857] [WARNING] No training data specified
[2024-07-30 17:30:23,857] [WARNING] No train_iters (recommended) or epochs specified, use default 10k iters.
[2024-07-30 17:30:23,857] [INFO] using world size: 1 and model-parallel size: 1
[2024-07-30 17:30:23,857] [INFO] > padded vocab (size: 100) with 28 dummy tokens (new size: 128)
Traceback (most recent call last):
  File "D:\PycharmProjects\CogVLM-main\finetune_demo\finetune_cogagent_demo.py", line 260, in <module>
    args = get_args(args_list)
           ^^^^^^^^^^^^^^^^^^^
  File "D:\conda3\envs\cogvlm\Lib\site-packages\sat\arguments.py", line 442, in get_args
    initialize_distributed(args)
  File "D:\conda3\envs\cogvlm\Lib\site-packages\sat\arguments.py", line 513, in initialize_distributed
    torch.distributed.init_process_group(
  File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\c10d_logger.py", line 86, in wrapper
    func_return = func(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1184, in init_process_group
    default_pg, _ = _new_process_group_helper(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1302, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
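
The final RuntimeError indicates that `torch.distributed.init_process_group` was asked for the NCCL backend, which the Windows PyTorch wheels do not include. As a rough diagnostic sketch (not the project's own code), the snippet below checks whether the installed build reports NCCL support and falls back to `gloo` otherwise. The helper name `choose_backend` is hypothetical, and the traceback does not show how `sat` selects its backend, so where the returned value would be passed in is an assumption to verify against `sat/arguments.py`.

```python
import sys
from typing import Optional


def choose_backend(platform: str = sys.platform,
                   nccl_available: Optional[bool] = None) -> str:
    """Hypothetical helper: return "nccl" only if it is usable, else "gloo"."""
    if nccl_available is None:
        try:
            import torch.distributed as dist
            # is_nccl_available() is only queried when the distributed
            # package itself is present in this build.
            nccl_available = dist.is_available() and dist.is_nccl_available()
        except ImportError:
            # torch is not installed in this environment.
            nccl_available = False
    # Treat Windows as NCCL-free regardless of what was reported.
    if platform.startswith("win") or not nccl_available:
        return "gloo"
    return "nccl"


if __name__ == "__main__":
    print("suggested backend:", choose_backend())
```

If this prints `gloo` on the affected machine, the value would need to reach the `backend` argument of `torch.distributed.init_process_group`. Whether `gloo` is sufficient for the rest of the fine-tuning script, including DeepSpeed, is not established by this check.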

Expected behavior

Yes. The finetune_demo script is expected to run on Windows 11 without raising this error.

THUDM / CogVLM