wl-zhao / DiffSwap

[CVPR 2023] DiffSwap is a diffusion-based face-swapping framework.

Code doesn't work #4

Open netrunner-exe opened 10 months ago

netrunner-exe commented 10 months ago

Running python pipeline.py gives the first error:

Traceback (most recent call last):
  File "/content/DiffSwap/pipeline.py", line 420, in <module>
    repair_by_mask()
  File "/content/DiffSwap/pipeline.py", line 351, in repair_by_mask
    gen_type_list = os.listdir(swap_path)
FileNotFoundError: [Errno 2] No such file or directory: 'data/portrait/swap_res'

After creating the swap_res folder, the second error:

Traceback (most recent call last):
  File "/content/DiffSwap/pipeline.py", line 421, in <module>
    paste()
  File "/content/DiffSwap/pipeline.py", line 384, in paste
    gen_type_list = os.listdir(data_root)
FileNotFoundError: [Errno 2] No such file or directory: 'data/portrait/swap_res_repair'
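To get past these two errors I simply created the folders by hand. A minimal sketch of the same workaround in Python (assuming the default data/portrait paths that pipeline.py uses; your layout may differ):

import os

# Hypothetical workaround: pre-create the output folders that
# repair_by_mask() and paste() expect before running pipeline.py.
for d in ("data/portrait/swap_res", "data/portrait/swap_res_repair"):
    os.makedirs(d, exist_ok=True)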

And finally, the full log after creating the swap_res_repair folder:

/content/DiffSwap
100% 1/1 [00:00<00:00,  3.63it/s]
100% 1/1 [00:00<00:00,  2.17it/s]
image: 100% 1/1 [00:00<00:00,  9.90it/s]
type source img_count 1
image: 100% 1/1 [00:00<00:00,  4.34it/s]
type target img_count 1
Recreating aligned images...
image: 100% 1/1 [00:00<00:00,  6.96it/s]
type source finished, processed 1 images
image: 100% 1/1 [00:00<00:00,  6.04it/s]
type target finished, processed 1 images
image: 100% 1/1 [00:00<00:00, 72.07it/s]
type source img_count 1
image: 100% 1/1 [00:00<00:00, 72.27it/s]
type target img_count 1
running face detection
2023-09-20 17:40:12.482367: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-20 17:40:13.527005: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-09-20 17:40:16.078296: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:16.118530: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:16.118928: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:16.119820: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:16.120106: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:16.120380: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:17.248821: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:17.249127: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:17.249384: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-09-20 17:40:17.249527: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2023-09-20 17:40:17.249570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13692 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5
  0% 0/1 [00:00<?, ?it/s]2023-09-20 17:40:18.605790: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8900
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 0s 233ms/step
1/1 [==============================] - 0s 139ms/step
1/1 [==============================] - 0s 127ms/step
1/1 [==============================] - 0s 117ms/step
1/1 [==============================] - 0s 109ms/step
1/1 [==============================] - 0s 103ms/step
1/1 [==============================] - 0s 101ms/step
2/2 [==============================] - 0s 63ms/step
1/1 [==============================] - 0s 221ms/step
100% 1/1 [00:03<00:00,  3.65s/it]
1/1 [==============================] - 0s 17ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 17ms/step
1/1 [==============================] - 0s 17ms/step
1/1 [==============================] - 0s 17ms/step
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 17ms/step
1/1 [==============================] - 0s 77ms/step
1/1 [==============================] - 0s 18ms/step
100% 1/1 [00:00<00:00,  1.63it/s]
gpu 0 process 1 images
running mtcnn
source 0.png
target 0.png
obtain the parameters of affine transformation
100% 1/1 [00:00<00:00, 52.78it/s]
type: source, cnt: 1
100% 1/1 [00:00<00:00, 104.65it/s]
type: target, cnt: 1
len(self.src_list): 1
start batch
100% 1/1 [00:00<00:00,  4.96it/s]
shuf: write error: Broken pipe
shuf: write error
/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects `--local-rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
usage: faceswap_portrait.py
       [-h]
       [--save_img SAVE_IMG]
       [--tgt_scale TGT_SCALE]
       [--world_size WORLD_SIZE]
       [--local_rank LOCAL_RANK]
       [--dist_on_itp]
       [--dist_url DIST_URL]
       checkpoint
faceswap_portrait.py: error: unrecognized arguments: --local-rank=0
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 4853) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
tests/faceswap_portrait.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-09-20_17:40:32
  host      : cb24f3dd7271
  rank      : 0 (local_rank: 0)
  exitcode  : 2 (pid: 4853)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
0it [00:00, ?it/s]
0it [00:00, ?it/s]

I think we need more detailed step-by-step installation instructions, or some troubleshooting guidance for these problems with the code. Thank you.

wl-zhao commented 10 months ago
usage: faceswap_portrait.py
       [-h]
       [--save_img SAVE_IMG]
       [--tgt_scale TGT_SCALE]
       [--world_size WORLD_SIZE]
       [--local_rank LOCAL_RANK]
       [--dist_on_itp]
       [--dist_url DIST_URL]
       checkpoint
faceswap_portrait.py: error: unrecognized arguments: --local-rank=0

This is caused by a different torch version. You can change --local_rank in faceswap_portrait.py to --local-rank.
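If you want the script to work across torch versions, a hedged sketch of an argparse definition that accepts both spellings (hypothetical; the actual parser code in faceswap_portrait.py may look different):

import argparse

parser = argparse.ArgumentParser()
# Accept both --local_rank (older torch.distributed.launch) and
# --local-rank (newer launchers) as aliases for the same argument.
parser.add_argument('--local_rank', '--local-rank', dest='local_rank',
                    type=int, default=0)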

dsj320 commented 6 months ago
tests/face_swap.sh: line 2: ss: command not found
shuf: write error: Broken pipe
shuf: write error
/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
usage: faceswap_portrait.py [-h] [--save_img SAVE_IMG] [--tgt_scale TGT_SCALE] [--world_size WORLD_SIZE] [--local-rank LOCAL_RANK] [--dist_on_itp] [--dist_url DIST_URL] checkpoint
faceswap_portrait.py: error: unrecognized arguments: --local_rank=0
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 31223) of binary: /opt/conda/envs/DiffSwap/bin/python3
Traceback (most recent call last):
  File "/opt/conda/envs/DiffSwap/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/DiffSwap/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/envs/DiffSwap/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
tests/faceswap_portrait.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-01-08_09:33:36
  host      : task-20240108094624-45015
  rank      : 0 (local_rank: 0)
  exitcode  : 2 (pid: 31223)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
Traceback (most recent call last):
  File "pipeline.py", line 420, in <module>
    repair_by_mask()
  File "pipeline.py", line 351, in repair_by_mask
    gen_type_list = os.listdir(swap_path)
FileNotFoundError: [Errno 2] No such file or directory: 'data/portrait/swap_res'

Hello author, I changed "--local_rank" to "--local-rank" but it didn't work. Do you know why? Thank you!

summertight commented 4 months ago

Set the default value of the local rank argument to 0.
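A hedged sketch of what that could look like (hypothetical; the real argument definition in faceswap_portrait.py may differ), combined with the LOCAL_RANK environment variable mentioned in the deprecation warning above:

import argparse
import os

parser = argparse.ArgumentParser()
# Default the local rank to 0 so the script still runs when the launcher
# does not pass the flag; fall back to the LOCAL_RANK env var set by torchrun.
parser.add_argument('--local-rank', type=int,
                    default=int(os.environ.get('LOCAL_RANK', 0)))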