Closed: unrue closed this issue 3 months ago
Hello @unrue. Actually, the image is not corrupted. An xmin value, which should be smaller than the image width, is larger than it, or perhaps it is larger than xmax. This looks like an annotation issue in the dataset itself.
Hi, thanks for the reply. So, are you saying that this error is irrelevant?
Corrupt JPEG data: 2 extraneous bytes before marker 0xd6
I'm pretty sure my dataset has the right coordinates; I checked before starting the training. Is it possible to find out which image is involved?
It may need some work. I think you will need to run with --workers 0 so that data loading runs in the main thread. Further, you will need to add a print statement in datasets.py to know exactly which image has the issue.
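As a rough sketch of that print-statement idea, a small wrapper can log each sample's path just before it is loaded, so the last path printed before the warning points at the suspect file. The class name PathLoggingDataset and the paths argument are hypothetical and not part of the pipeline's datasets.py; the sketch assumes the wrapped dataset is indexable and that you can supply a list of image paths in the same order.

```python
import sys


class PathLoggingDataset:
    """Hypothetical wrapper: prints a sample's path before delegating the load."""

    def __init__(self, dataset, paths):
        # dataset: any object supporting len() and integer indexing
        # paths: image paths in the same order as the dataset (assumption)
        self.dataset = dataset
        self.paths = paths

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        # Write to stderr and flush so the line appears right before any
        # decoder warning that the actual load may emit.
        print(f"loading [{idx}] {self.paths[idx]}", file=sys.stderr, flush=True)
        return self.dataset[idx]
```

This is only useful with --workers 0, since with several workers the log lines and the warning can interleave.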
Hello, I have just pushed an update to datasets.py. It removes all files with invalid bounding boxes before training. I am closing the issue for now. Please re-open if needed.
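For anyone who wants to check their annotations independently, here is a minimal sketch of a bounding-box sanity check for one Pascal VOC XML annotation. It is not the code that was pushed to datasets.py; the function name invalid_boxes is made up for illustration, and it assumes the usual VOC layout (a size element with width and height, and object elements with a bndbox).

```python
import xml.etree.ElementTree as ET


def invalid_boxes(xml_text):
    """Return (name, xmin, ymin, xmax, ymax) for every box that fails the check."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    problems = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # A valid box lies inside the image and has positive width and height.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append((obj.findtext("name"), xmin, ymin, xmax, ymax))
    return problems
```

Running this over every annotation file (for example by reading each .xml file and passing its contents in) and printing the file name whenever the returned list is non-empty should show which annotations would be filtered out.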
I'm using this tool with a dataset in Pascal VOC format, containing about 25000 images (roughly 20000 training and 5000 validation). I'm launching the tool as:
python3.10 fasterrcnn-pytorch-training-pipeline/train.py --data my_dataset/data_configs/beni_culturali.yaml --epochs 100 --model fasterrcnn_resnet50_fpn --name my_dataset --batch 16 --disable-wandb
After some iterations, I get:
Why does it detect a corrupted JPEG? I have used this dataset with other object detection tools and it worked well. How can I find out which JPEG is involved, or how can I simply skip the corrupted JPEG? Thanks.