Open wbfwonderful opened 1 year ago
This error does not seem to be related to my code. It is raised while reading your training images. I haven't encountered it before and don't know how to solve it.
Maybe you can search for "OSError: cannot identify image file" and find a solution on the Internet.
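One way to narrow this down is to scan the training image folder for files that Pillow cannot read before preparing the dataset. The following is only a minimal sketch under that assumption: `find_unreadable_images` is a hypothetical helper, not part of DualStyleGAN, and the folder you pass to it is whatever directory holds your own training images.

```python
from pathlib import Path

from PIL import Image


def find_unreadable_images(root):
    """Return the files under `root` that Pillow cannot open or verify."""
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            with Image.open(path) as img:
                # verify() checks file integrity without decoding all pixel data
                img.verify()
        except Exception:
            # Pillow raises OSError (or SyntaxError for some formats) on bad files
            bad.append(path)
    return bad


# Example call (the path is a placeholder for your own image folder):
# for p in find_unreadable_images("path/to/your/training/images"):
#     print(p)
```

Any file it reports (empty files, non-image files, interrupted downloads) is a candidate for the image that triggers the error, and can be removed or re-exported before rebuilding the dataset.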
Hi, I am training on my own dataset on Colab following the steps in the README, but the training fails in the second step of facial destylization, "Step 2: Fine-tune StyleGAN". The error output is as follows:
load model: ./checkpoint/stylegan2-ffhq-config-f.pt
0%| | 0/600 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "finetune_stylegan.py", line 391, in <module>
    train(args, loader, generator, discriminator, g_optim, d_optim, g_ema, device)
  File "finetune_stylegan.py", line 115, in train
    real_img = next(loader)
  File "/content/DualStyleGAN/util.py", line 58, in sample_data
    for batch in loader:
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 721, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/DualStyleGAN/model/stylegan/dataset.py", line 37, in __getitem__
    img = Image.open(buffer)
  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 2657, in open
    % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f070291c410>
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2003) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 755, in run
    )(*cmd_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
finetune_stylegan.py FAILED
Failures:
Root Cause (first observed failure):
[0]:
  time : 2022-11-23_07:27:01
  host : a3b13d7b3fb3
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 2003)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
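The first traceback ends at Image.open(buffer) in model/stylegan/dataset.py, where the buffer is an in-memory BytesIO object, so the bytes handed to Pillow were not a recognizable image. The sketch below reproduces that behavior in isolation; it is not the repository's code, and `try_open` is a hypothetical helper used only for illustration.

```python
from io import BytesIO

from PIL import Image


def try_open(data):
    """Return the image size, or None if Pillow cannot identify the bytes."""
    try:
        with Image.open(BytesIO(data)) as img:
            return img.size
    except OSError:
        # "cannot identify image file" is raised as an OSError (or a subclass)
        return None


# Build a small valid PNG in memory for comparison.
buf = BytesIO()
Image.new("RGB", (8, 8)).save(buf, format="PNG")
valid_png = buf.getvalue()

print(try_open(valid_png))  # valid image bytes decode normally
print(try_open(b""))        # empty bytes cannot be identified
print(try_open(b"hello"))   # arbitrary non-image bytes cannot be identified
```

If the dataset was built from a folder containing empty or non-image files, or from the wrong folder, a stored record can end up in one of the failing cases above.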
@wbfwonderful I guess your training pictures should be placed in the images/train/ directory.