WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

yolo-seg: ValueError: cannot reshape array of size 0 into shape (480,640,3) #1389

Closed. Robotatron closed this issue 1 year ago

Robotatron commented 1 year ago

I'm running training with the following command:

python seg/segment/train.py --data sagemaker.yaml --cfg yolov7s-seg.yaml --hyp hyp.scratch-high.yaml --img 480 --batch -1 --project runs/SageMaker/train --name 01-S-600-hyphigh

I've opened every image with PIL and converted each one to RGB, so PIL would have raised an error on any corrupt image file.

Error:

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
      0/299      23.8G     0.1107     0.2574    0.06515    0.05312       1813        480:   3%|▎         | 43/1585 [00:53<29:15,  1.14s/it]
libpng warning: iCCP: known incorrect sRGB profile
      0/299      23.8G     0.1107      0.256     0.0654     0.0531       1959        480:   3%|▎         | 55/1585 [01:06<27:59,  1.10s/it]
libpng warning: iCCP: known incorrect sRGB profile
      0/299      23.8G     0.1101      0.241     0.0653    0.05295       1962        480:   7%|▋         | 111/1585 [02:08<28:21,  1.15s/it]
Traceback (most recent call last):
  File "seg/segment/train.py", line 681, in <module>
    main(opt)
  File "seg/segment/train.py", line 577, in main
    train(opt.hyp, opt, device, callbacks)
  File "seg/segment/train.py", line 295, in train
    for i, (imgs, targets, paths, _, masks) in pbar:  # batch ------------------------------------------------------
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/jupyter/yolov7/seg/utils/dataloaders.py", line 171, in __iter__
    yield next(self.iterator)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1183, in _next_data
    return self._process_data(data)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 7.
Original Traceback (most recent call last):
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/jupyter/yolov7/seg/utils/segment/dataloaders.py", line 111, in __getitem__
    img, labels, segments = self.load_mosaic(index)
  File "/home/jupyter/yolov7/seg/utils/segment/dataloaders.py", line 217, in load_mosaic
    img, _, (h, w) = self.load_image(index)
  File "/home/jupyter/yolov7/seg/utils/dataloaders.py", line 677, in load_image
    im = np.load(fn)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/numpy/lib/npyio.py", line 432, in load
    return format.read_array(fid, allow_pickle=allow_pickle,
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/numpy/lib/format.py", line 820, in read_array
    array.shape = shape
ValueError: cannot reshape array of size 0 into shape (480,640,3)

yulin010101 commented 1 year ago

Does training work without using cache_images?

Robotatron commented 1 year ago

Does training work without using cache_images?

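I did not use any caching for this run (see my command-line arguments above). I found the issue, though: it was a broken .npy file left over from a previous run with --cache disk. The traceback shows load_image calling np.load(fn), so the leftover file was read even though this run did not request caching.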
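
For anyone hitting the same error, here is a minimal sketch for finding such leftover files. It assumes the cached .npy files live somewhere under the dataset directory; the function name find_broken_npy and the example path are hypothetical and not part of the yolov7 repo. It simply tries to load every .npy file and reports the ones NumPy rejects, which should include truncated files like the one in the traceback above.

```python
from pathlib import Path

import numpy as np


def find_broken_npy(root):
    """Return the .npy files under `root` that NumPy fails to load.

    Hypothetical helper: `root` is assumed to be the dataset directory
    that holds the images and any cache files written next to them.
    """
    broken = []
    for path in sorted(Path(root).rglob("*.npy")):
        try:
            # A truncated or empty file raises here (ValueError, EOFError, ...).
            np.load(path, allow_pickle=False)
        except Exception:
            broken.append(path)
    return broken


# Example usage (the path is a placeholder, adjust it to your dataset):
# for path in find_broken_npy("datasets/sagemaker/images"):
#     print("broken cache file:", path)
#     path.unlink()  # delete it so the loader falls back to the image file
```

Deleting the reported files (or all .npy files, if the disk cache is no longer wanted) should make the loader read the original images again, assuming it only uses a .npy file when one exists.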