WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

Corrupt JPEG data: premature end of data segment #1249

Open Maioy97 opened 1 year ago

Maioy97 commented 1 year ago

I'm trying to retrain a YOLOv7 model with part of the SKU110K dataset. I copied around 500, 250, and 250 images for train, test, and valid into a separate folder, and converted the annotations from the CSV format to the YOLO format, where each image gets a .txt file with its annotations.
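For reference, the conversion is roughly along these lines (a sketch, not the exact script; it assumes the SKU110K CSV columns are image_name, x1, y1, x2, y2, class, image_width, image_height, and a single class with id 0):

```python
import csv
from collections import defaultdict
from pathlib import Path

def csv_to_yolo(csv_path, label_dir):
    """Convert SKU110K-style CSV annotations to YOLO .txt label files."""
    label_dir = Path(label_dir)
    label_dir.mkdir(parents=True, exist_ok=True)
    boxes = defaultdict(list)
    with open(csv_path, newline="") as f:
        # assumed columns: image_name, x1, y1, x2, y2, class, image_width, image_height
        for name, x1, y1, x2, y2, _cls, iw, ih in csv.reader(f):
            x1, y1, x2, y2, iw, ih = map(float, (x1, y1, x2, y2, iw, ih))
            # YOLO format: class x_center y_center width height, normalized to [0, 1]
            xc, yc = (x1 + x2) / 2 / iw, (y1 + y2) / 2 / ih
            bw, bh = (x2 - x1) / iw, (y2 - y1) / ih
            boxes[name].append(f"0 {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}")
    for name, lines in boxes.items():
        (label_dir / (Path(name).stem + ".txt")).write_text("\n".join(lines) + "\n")
```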

My problem is that the training script keeps printing "Corrupt JPEG data" messages, even after I tried to fix the images with OpenCV. Any idea how to fix them?


Transferred 552/566 items from yolov7.pt
Scaled weight_decay = 0.0005
Optimizer groups: 95 .bias, 95 conv.weight, 98 other
train: Scanning '/content/yolov7/SKU110K_fixed/images/part_annotations_train.cache' images and labels... 499 found, 0 missing, 0 empty, 0 corrupted: 100% 499/499 [00:00<?, ?it/s]
val: Scanning '/content/yolov7/SKU110K_fixed/images/part_annotations_val.cache' images and labels... 249 found, 0 missing, 0 empty, 0 corrupted: 100% 249/249 [00:00<?, ?it/s]

autoanchor: Analyzing anchors... anchors/target = 4.41, Best Possible Recall (BPR) = 0.9995
Image sizes 640 train, 640 test
Using 2 dataloader workers
Logging results to runs/train/yolov72
Starting training for 300 epochs...

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
  0% 0/16 [00:00<?, ?it/s]Corrupt JPEG data: 305 extraneous bytes before marker 0xd9
Corrupt JPEG data: 786 extraneous bytes before marker 0xd9
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
  0% 0/16 [01:07<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 616, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 361, in train
    pred = model(imgs)  # forward
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/yolov7/models/yolo.py", line 599, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/content/yolov7/models/yolo.py", line 625, in forward_once
    x = m(x)  # run
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/yolov7/models/common.py", line 108, in forward
    return self.act(self.bn(self.conv(x)))
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 14.76 GiB total capacity; 13.26 GiB already allocated; 71.75 MiB free; 13.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
JeremyDemers-Pfizer commented 1 year ago

This may help you check and fix your images: https://stackoverflow.com/questions/33548956/detect-avoid-premature-end-of-jpeg-in-cv2-python
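The idea there, roughly (a sketch based on that answer, not tested on this dataset): a valid JPEG ends with the EOI marker 0xFFD9, so files missing it are truncated and can be re-saved with OpenCV:

```python
import cv2
from pathlib import Path

def fix_truncated_jpegs(image_dir):
    """Re-encode JPEGs that are missing the end-of-image (EOI) marker 0xFFD9."""
    for path in Path(image_dir).glob("*.jpg"):
        if path.read_bytes()[-2:] != b"\xff\xd9":
            img = cv2.imread(str(path))  # decodes whatever data is intact
            if img is not None:
                cv2.imwrite(str(path), img)  # rewrite as a well-formed JPEG
            else:
                print(f"Could not decode, consider removing: {path}")
```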

Maioy97 commented 1 year ago

I tried this before posting, but it still gives the error even though OpenCV doesn't find any more corrupt images.
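The warnings seem to come from libjpeg while the dataloader decodes the images, and files with "extraneous bytes before marker 0xd9" still decode successfully, so a read-based check passes them. A possible workaround (a sketch, untested here) is to re-encode every image unconditionally so the stray bytes are stripped:

```python
from pathlib import Path
from PIL import Image

def reencode_all(image_dir):
    """Re-save every JPEG so any trailing/extraneous bytes are dropped."""
    for path in Path(image_dir).glob("*.jpg"):
        try:
            with Image.open(path) as img:
                img.convert("RGB").save(path, "JPEG", quality=95)
        except OSError as e:
            print(f"Skipping unreadable image {path}: {e}")
```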

freshn commented 1 year ago

Have you solved it? I encountered the same bug.

doots1802 commented 1 year ago

Yeah, have you solved it?