Open FlorianRegisBamb opened 2 years ago
It's hard to know. Maybe this image is corrupt, or it is too small (smaller than your crop size). Which --preprocess flag did you use?
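One way to narrow this down is to check whether every file in the dataset folder can be read from start to finish. The sketch below is only an illustration and is not part of the training code: the function name `find_unreadable_files` and its parameters are hypothetical, and it uses only the Python standard library. It walks a directory, reads each file in chunks, and collects the files for which the operating system raises an OSError (which includes Errno 5). It does not decode the images or check their dimensions.

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Return (path, error) pairs for files under root that cannot be read to the end.

    Hypothetical helper for illustration; it only tests raw reads and does not
    decode images or check their size.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fp:
                    # Read in chunks so large files do not need to fit in memory.
                    while fp.read(chunk_size):
                        pass
            except OSError as exc:
                # Covers I/O errors as well as missing or unreadable files.
                failures.append((path, exc))
    return failures
```

Calling `find_unreadable_files("path/to/dataroot")` (the path is a placeholder) a few times in a row can help tell a permanently damaged file from an intermittent storage problem: a damaged file should fail on every pass, while an intermittent problem may show up on different files or not at all.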
I am using "resize_and_crop" as my --preprocess flag. What confuses me is that training ran fine until this epoch (148), and the error comes from my input data, which did not change during training. When I resume with --continue_train, it sometimes works, and other times the same error comes back after a few epochs.
Sometimes, your cropping might get unlucky. It depends not only on the image sizes, but also on where the patches are cropped.
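To illustrate why the crop location differs from one iteration to the next, here is a minimal sketch of random crop sampling. It is an assumption-laden illustration, not the repository's actual code: the function name `sample_crop_position` is hypothetical, and the sizes 286 and 256 in the usage note are assumed values chosen for the example.

```python
import random


def sample_crop_position(load_size, crop_size, rng=random):
    """Pick the top-left corner of a crop_size square inside a load_size square.

    Hypothetical helper for illustration only.
    """
    # If the image is not larger than the crop, the only valid offset is 0.
    max_offset = max(0, load_size - crop_size)
    x = rng.randint(0, max_offset)  # randint is inclusive on both ends
    y = rng.randint(0, max_offset)
    return x, y
```

With an assumed resized image of 286 pixels per side and a 256-pixel crop, `sample_crop_position(286, 256)` returns offsets between 0 and 30 on each axis, so the same image yields a different patch each time it is drawn.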
Hello, I was training my model and it was working until epoch 148, when I got these errors: <<OSError: Caught OSError in DataLoader worker process 3>> and <<OSError: [Errno 5] Input/output error>>. I'm training the model on a Linux VM.
learning rate 0.0001050 -> 0.0001030
(epoch: 148, iters: 50, time: 5.328, data: 0.004) G_GAN: 1.660 G_L1: 21.545 D_real: 0.006 D_fake: 0.244 G: 23.206 D: 0.125
saving the latest model (epoch 148, total_iters 60000)
(epoch: 148, iters: 150, time: 1.322, data: 0.003) G_GAN: 1.076 G_L1: 34.955 D_real: 0.000 D_fake: 0.642 G: 36.031 D: 0.321
(epoch: 148, iters: 250, time: 1.316, data: 0.004) G_GAN: 2.841 G_L1: 17.667 D_real: 0.607 D_fake: 0.061 G: 20.508 D: 0.334
(epoch: 148, iters: 350, time: 1.338, data: 0.004) G_GAN: 1.837 G_L1: 25.288 D_real: 0.050 D_fake: 0.239 G: 27.126 D: 0.144
(epoch: 148, iters: 450, time: 2.624, data: 0.003) G_GAN: 5.915 G_L1: 23.653 D_real: 0.006 D_fake: 0.003 G: 29.568 D: 0.005
(epoch: 148, iters: 550, time: 1.307, data: 0.004) G_GAN: 1.869 G_L1: 35.894 D_real: 0.004 D_fake: 0.292 G: 37.763 D: 0.148
(epoch: 148, iters: 650, time: 1.308, data: 0.003) G_GAN: 1.511 G_L1: 21.548 D_real: 0.095 D_fake: 0.382 G: 23.059 D: 0.238
(epoch: 148, iters: 750, time: 1.338, data: 0.003) G_GAN: 3.447 G_L1: 22.605 D_real: 0.088 D_fake: 0.038 G: 26.052 D: 0.063
(epoch: 148, iters: 850, time: 2.473, data: 0.004) G_GAN: 3.026 G_L1: 22.714 D_real: 0.017 D_fake: 0.063 G: 25.740 D: 0.040
Traceback (most recent call last):
File "/home/exxact/Documents/OMEGA/OMEGA_RD_IA/CycleGAN_Pix2Pix/train.py", line 44, in <module>
for i, data in enumerate(dataset): # inner loop within one epoch
File "/home/exxact/Documents/OMEGA/OMEGA_RD_IA/CycleGAN_Pix2Pix/data/__init__.py", line 90, in __iter__
for i, data in enumerate(self.dataloader):
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/home/exxact/.local/lib/python3.10/site-packages/torch/_utils.py", line 461, in reraise
raise exception
OSError: Caught OSError in DataLoader worker process 3.
Original Traceback (most recent call last):
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/exxact/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/exxact/Documents/OMEGA/OMEGA_RD_IA/CycleGAN_Pix2Pix/data/aligned_dataset.py", line 45, in __getitem__
A = AB.crop((0, 0, w2, h))
File "/usr/lib/python3/dist-packages/PIL/Image.py", line 1146, in crop
self.load()
File "/usr/lib/python3/dist-packages/PIL/ImageFile.py", line 235, in load
s = read(self.decodermaxblock)
File "/usr/lib/python3/dist-packages/PIL/JpegImagePlugin.py", line 402, in load_read
s = self.fp.read(read_bytes)
OSError: [Errno 5] Input/output error
May I ask for help understanding where this comes from?