chengzeyi / piflux

(WIP) Parallel inference for black-forest-labs' FLUX model.

RuntimeError: value cannot be converted to type int64_t without overflow #2

feifeibear closed this issue 1 week ago

feifeibear commented 1 week ago

rank0: Traceback (most recent call last):
rank0:   File "/home/reaper/xdit_test/piflux/examples/run_flux.py", line 370, in
rank0:   File "/home/reaper/xdit_test/piflux/examples/run_flux.py", line 303, in main
rank0:   File "/home/reaper/xdit_test/piflux/src/piflux/adapters/diffusers.py", line 84, in new_call
rank0:     seed_t = torch.full([1], seed, dtype=torch.int64)
rank0: RuntimeError: value cannot be converted to type int64_t without overflow
rank0:[W1115 09:08:20.026822356 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
rank1: Traceback (most recent call last):
rank1:   File "/home/reaper/xdit_test/piflux/examples/run_flux.py", line 370, in

rank1:   File "/home/reaper/xdit_test/piflux/examples/run_flux.py", line 303, in main
rank1:   File "/home/reaper/xdit_test/piflux/src/piflux/adapters/diffusers.py", line 85, in new_call
rank1:     seed_t = piflux_ops.get_complete_tensor(seed_t, dim=0)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_ops.py", line 1116, in __call__
rank1:     return self._op(*args, **(kwargs or {}))
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_library/autograd.py", line 113, in autograd_impl
rank1:     result = forward_no_grad(*args, Metadata(keyset, keyword_only_args))
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_library/autograd.py", line 40, in forward_no_grad
rank1:     result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_ops.py", line 721, in redispatch
rank1:     return self._handle.redispatch_boxed(keyset, *args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_library/custom_ops.py", line 324, in backend_impl
rank1:     result = self._backend_fns[device_type](*args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner
rank1:     return disable_fn(*args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn
rank1:     return fn(*args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/_library/custom_ops.py", line 367, in wrapped_fn
rank1:     return fn(*args, **kwargs)
rank1:   File "/home/reaper/xdit_test/piflux/src/piflux/ops/context_ops.py", line 80, in get_complete_tensor
rank1:     dist.all_gather(gathered_tensors, tensor.contiguous())
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
rank1:     return func(*args, **kwargs)
rank1:   File "/home/reaper/miniconda3/envs/fjr/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 3346, in all_gather

chengzeyi commented 1 week ago

Fixed in the latest version.
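
For anyone pinned to an older version, here is a possible workaround sketch. It assumes (the thread does not confirm this) that the failure on rank0 comes from a `seed` larger than 2**63 - 1, the largest value `torch.int64` can hold, being passed to `torch.full`. The helper name `fold_seed_to_int64` is hypothetical and not part of piflux, and the actual fix in the repository may look different.

```python
INT64_MAX = 2**63 - 1  # largest value representable as a signed 64-bit integer


def fold_seed_to_int64(seed: int) -> int:
    """Hypothetical helper: map any Python int into [0, INT64_MAX].

    Masking keeps the low 63 bits, so non-negative seeds that already
    fit in int64 are returned unchanged.
    """
    return seed & INT64_MAX


if __name__ == "__main__":
    big_seed = 2**64 - 1  # assumed example: an unsigned 64-bit seed
    safe_seed = fold_seed_to_int64(big_seed)
    print(safe_seed <= INT64_MAX)
    # The folded value could then be used in the call from the traceback:
    # seed_t = torch.full([1], safe_seed, dtype=torch.int64)
```

Note that masking changes the effective seed whenever the original was out of range, so images will not match a run that used the unmasked value somewhere else.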