muslll / neosr

neosr is a framework for training real-world single-image super-resolution networks.
https://github.com/muslll/neosr
Apache License 2.0

cv2.error: Caught error in DataLoader worker process 2. #5

Closed. radry closed this issue 1 year ago

radry commented 1 year ago

Shortly after training starts, the following error occurs:

2023-09-06 18:24:25,932 INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
  File "G:\_AI\UPSCALE\neosr\train.py", line 241, in <module>
    train_pipeline(root_path)
  File "G:\_AI\UPSCALE\neosr\train.py", line 215, in train_pipeline
    train_data = prefetcher.next()
                 ^^^^^^^^^^^^^^^^^
  File "G:\_AI\UPSCALE\neosr\neosr\data\prefetch_dataloader.py", line 97, in next
    self.preload()
  File "G:\_AI\UPSCALE\neosr\neosr\data\prefetch_dataloader.py", line 83, in preload
    self.batch = next(self.loader)  # self.batch is a dict
                 ^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1345, in _next_data
    return self._process_data(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1371, in _process_data
    data.reraise()
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\_utils.py", line 644, in reraise
    raise exception
cv2.error: Caught error in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
            ~~~~~~~~~~~~^^^^^
  File "G:\_AI\UPSCALE\neosr\neosr\data\paired_dataset.py", line 87, in __getitem__
    img_gt = imfrombytes(img_bytes, float32=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_AI\UPSCALE\neosr\neosr\utils\img_util.py", line 133, in imfrombytes
    img = cv2.imdecode(img_np, imread_flags[flag])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cv2.error: OpenCV(4.8.0) D:\a\opencv-python\opencv-python\opencv\modules\imgcodecs\src\loadsave.cpp:802: error: (-215:Assertion failed) !buf.empty() in function 'cv::imdecode_'

The strange thing is that the path in the last line doesn't exist on my PC; I don't even have that drive.

I followed the installation instructions, so all dependencies should be installed.

Windows 10, Python 3.11.5, Torch 2, CUDA 11.8

Config file attached: train_realesrgan.txt

muslll commented 1 year ago

Hi @radry, it looks like the script is failing to fetch the images. I tested the config file you sent (only replacing the dataset paths to use the nomos_uni lmdb) and it works correctly on my end, so I assume this is either a dependency issue or an incorrect dataset (or incorrect paths). Please verify that your dataset folder is accessible to the script; access might currently be denied (your current user may lack the privileges). If you installed according to the instructions, I'd recommend downloading the nomos_uni dataset to a local, user-accessible path (like C:\Users\user\Downloads\nomos_uni_gt\), just to make sure there's nothing wrong with your dataset or folder access privileges.
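
The `!buf.empty()` assertion in the traceback means `cv2.imdecode` was handed an empty byte buffer, which can happen when a file is zero bytes long. The following is a minimal sketch for spotting such files, and files the current user cannot open, before training. It assumes the dataset is a plain folder of image files; `find_unreadable_or_empty` is a hypothetical helper, not part of neosr.

```python
import os


def find_unreadable_or_empty(root):
    """Return (path, reason) pairs for files under root that are empty or cannot be opened."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # Reading one byte is enough to tell an empty file from a non-empty one.
                with open(path, "rb") as fh:
                    first_byte = fh.read(1)
            except OSError as exc:
                # Covers permission errors and other I/O failures.
                problems.append((path, "cannot read: " + exc.__class__.__name__))
                continue
            if not first_byte:
                problems.append((path, "empty file"))
    return problems
```

Calling `find_unreadable_or_empty` on the dataset folder and printing the result should list any file that would reach the decoder as an empty buffer. A file that passes this check can still be corrupt, so this only narrows the search.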

radry commented 1 year ago

Thank you, I tried the nomos_uni dataset and it seems to run without errors.

I don't know how to fix my dataset though. I had to run it through ImageMagick's mogrify because, before that, it was constantly giving me errors about an incorrect ICC sRGB profile.

If I may ask you another question since you tried running my config:
When it loads the pretrained model, I get a lot of warnings like this (shortened excerpt here). Is this normal?

2023-09-07 12:37:44,903 INFO: Loading esrgan model from G:\_AI\UPSCALE\Models\RealESRGAN_x4plus.pth, with param key: [params_ema].
2023-09-07 12:37:44,956 WARNING: Current net - loaded net:
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv1.bias
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv1.weight
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv2.bias
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv2.weight
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv3.bias
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv3.weight
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv4.bias
2023-09-07 12:37:44,958 WARNING:   body.0.rdb1.conv4.weight
2023-09-07 12:37:44,959 WARNING:   body.0.rdb1.conv5.bias
2023-09-07 12:37:44,959 WARNING:   body.0.rdb1.conv5.weight
2023-09-07 12:37:44,959 WARNING:   body.0.rdb2.conv1.bias
2023-09-07 12:37:44,959 WARNING:   body.0.rdb2.conv1.weight
2023-09-07 12:37:44,959 WARNING:   body.0.rdb2.conv2.bias
2023-09-07 12:37:44,960 WARNING:   body.0.rdb2.conv2.weight
2023-09-07 12:37:44,960 WARNING:   body.0.rdb2.conv3.bias
2023-09-07 12:37:44,960 WARNING:   body.0.rdb2.conv3.weight
2023-09-07 12:37:44,960 WARNING:   body.0.rdb2.conv4.bias
2023-09-07 12:37:44,961 WARNING:   body.0.rdb2.conv4.weight
2023-09-07 12:37:44,961 WARNING:   body.0.rdb2.conv5.bias
2023-09-07 12:37:44,961 WARNING:   body.0.rdb2.conv5.weight
2023-09-07 12:37:44,962 WARNING:   body.0.rdb3.conv1.bias
2023-09-07 12:37:44,962 WARNING:   body.0.rdb3.conv1.weight
2023-09-07 12:37:44,962 WARNING:   body.0.rdb3.conv2.bias
2023-09-07 12:37:44,962 WARNING:   body.0.rdb3.conv2.weight
2023-09-07 12:37:44,963 WARNING:   body.0.rdb3.conv3.bias
2023-09-07 12:37:44,963 WARNING:   body.0.rdb3.conv3.weight
2023-09-07 12:37:44,963 WARNING:   body.0.rdb3.conv4.bias
2023-09-07 12:37:44,963 WARNING:   body.0.rdb3.conv4.weight
2023-09-07 12:37:44,964 WARNING:   body.0.rdb3.conv5.bias
2023-09-07 12:37:44,964 WARNING:   body.0.rdb3.conv5.weight

muslll commented 1 year ago

"I don't know how to fix my dataset though."

Make sure all images are in a format supported by OpenCV, such as PNG, JPEG, WebP or TIFF.
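
One way to check this without relying on file extensions is to look at each file's leading bytes. The sketch below recognises only the four formats named above by their standard signatures; `sniff_image_format` is a hypothetical helper, not part of neosr or OpenCV, and a matching signature does not guarantee that the rest of the file decodes.

```python
def sniff_image_format(path):
    """Guess the image format from the file's leading bytes; return None if unrecognised."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    # TIFF starts with a byte-order mark followed by the number 42.
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    return None
```

Files for which this returns `None`, or a format that disagrees with the extension, are the first candidates to re-export.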

"a lot of warnings like this"

I've just committed the fix; please update the repository :+1:
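
For readers who see similar output: the "Current net - loaded net" heading in the log reads as a set difference, that is, parameter names the current network defines but the loaded checkpoint does not contain. A minimal sketch of that comparison follows, using plain lists of key names. `diff_state_dict_keys` and the example keys are hypothetical and do not reproduce the fix that was committed.

```python
def diff_state_dict_keys(current_keys, loaded_keys):
    """Compare two collections of parameter names and report what each side lacks."""
    current, loaded = set(current_keys), set(loaded_keys)
    return {
        # Names the current network expects but the checkpoint does not provide.
        "current_minus_loaded": sorted(current - loaded),
        # Names the checkpoint provides but the current network does not use.
        "loaded_minus_current": sorted(loaded - current),
    }


# Hypothetical key names, for illustration only.
report = diff_state_dict_keys(
    ["body.0.conv1.weight", "body.0.conv1.bias"],
    ["body.0.conv1.weight", "extra.bias"],
)
```

With a PyTorch model, the two arguments could be the keys of `model.state_dict()` and of the loaded checkpoint dictionary. Both lists being empty would indicate that the parameter names match.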