Closed arthurkafer closed 3 years ago
I forgot to mention: all of the images and annotation files open fine, even the ones that produce errors.
Hi there! What's the folder structure of your project, and what is the content of the geral/anns/2008_003320.xml annotation?
There's nothing special about the folder structure: everything is saved in the base folder 'geral', in the subfolders imgs, anns, imgs_validation and anns_validation.
The content of the file is nothing special either:
<annotation>
<folder>VOC2012</folder>
<filename>2008_003320.jpg</filename>
<source>
<database>The VOC2008 Database</database>
<annotation>PASCAL VOC2008</annotation>
<image>flickr</image>
</source>
<size>
<width>500</width>
<height>375</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>motorbike</name>
<pose>Left</pose>
<truncated>0</truncated>
<occluded>1</occluded>
<bndbox>
<xmin>80</xmin>
<ymin>147</ymin>
<xmax>393</xmax>
<ymax>347</ymax>
</bndbox>
<difficult>0</difficult>
</object>
<object>
<name>person</name>
<pose>Left</pose>
<truncated>0</truncated>
<occluded>1</occluded>
<bndbox>
<xmin>177</xmin>
<ymin>60</ymin>
<xmax>296</xmax>
<ymax>296</ymax>
</bndbox>
<difficult>0</difficult>
</object>
<object>
<name>person</name>
<pose>Frontal</pose>
<truncated>1</truncated>
<occluded>0</occluded>
<bndbox>
<xmin>448</xmin>
<ymin>117</ymin>
<xmax>483</xmax>
<ymax>168</ymax>
</bndbox>
<difficult>0</difficult>
</object>
<object>
<name>pottedplant</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<occluded>0</occluded>
<bndbox>
<xmin>426</xmin>
<ymin>162</ymin>
<xmax>500</xmax>
<ymax>276</ymax>
</bndbox>
<difficult>0</difficult>
</object>
<object>
<name>pottedplant</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<occluded>1</occluded>
<bndbox>
<xmin>79</xmin>
<ymin>155</ymin>
<xmax>143</xmax>
<ymax>274</ymax>
</bndbox>
<difficult>0</difficult>
</object>
</annotation>
Reading the axelerate/networks/common_utils/augment.py file, I see that the problem is actually with the image file, not with the annotation file. That is even weirder, because my script opens all the files just to check that every image has an annotation file. I tested, and the problem seems to happen in the
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
call, which failed for 90% of the images in my dataset.
Looking at the images, I saw that they actually weren't in the Google Colab environment. I used to upload them manually from my local computer, but I'll have to host them somewhere so that all of them download correctly. I'll do that and close the issue if I can make it work that way.
Well, actually, if I'm not mistaken, cv2.cvtColor raises this exception if the image array is empty, i.e. no image was opened. imread() on a non-existent image doesn't raise an exception. https://docs.opencv.org/master/db/deb/tutorial_display_image.html You can see here that it returns None if the image cannot be read, for whatever reason. This is why I put both imread and cvtColor in the try-except block. In hindsight, I should have just checked whether the image is None...
Nevertheless, it still does seem like an image path problem. If you are not able to debug it yourself, can you share the Colab notebook and dataset with me privately? E.g. DM on LinkedIn or Twitter.
Yes, you're right, only cv2.cvtColor raises the exception; I was doing the validation wrong. But it's OK, I solved the image path and download problem, and it's training fine right now.
Thanks for your time and work!
Hey, bringing this issue back again.
I've switched the way I import my dataset: I now always upload it to Google Drive and download it inside Google Colab. The error is the same one that happened before:
Then I went to check whether those images are really in my dataset. But then I realized something: the image name geral/imgs/nvcamtest_21284_s00_00000.jpg is the old filename of the image; the new one is frontalimage37.jpg. The annotation file is named frontalimage37.xml too, and the annotation file nvcamtest_21284 does not exist in the dataset folder.
I renamed all of the images using os.rename(old_filename, new_filename). I can send you the Colab link if you could help me debug it.
Could it be something about renaming those files?
Thanks in advance
Hi, Arthur! So, if you look inside one of the annotation files, you'll see something similar to:
<annotation verified="yes">
<folder>Mark</folder>
<filename>IMG_20191130_115225_BURST20.jpg</filename>
<path>/home/ubuntu/Documents/Mark/IMG_20191130_115225_BURST20.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>4000</width>
<height>3000</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>mark</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1105</xmin>
<ymin>574</ymin>
<xmax>3343</xmax>
<ymax>2351</ymax>
</bndbox>
</object>
</annotation>
This is the standard PASCAL VOC object annotation format; as you can see, there is a filename field here. During training, aXeleRate takes the filename and joins it with the image folder path. Someone made a PR to aXeleRate that reads the path field value if it is present in the annotation. I haven't reviewed it yet.
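To see which annotations point at images that are not there, the join described above can be reproduced offline. The following is a minimal sketch using only the Python standard library; `find_missing_images` is a hypothetical helper, and the lookup rule (image folder joined with the `filename` field) is taken from the description above rather than from aXeleRate's source.

```python
import os
import xml.etree.ElementTree as ET


def find_missing_images(ann_dir, img_dir):
    """Return (annotation file, expected image path) pairs with no image on disk.

    Hypothetical helper for illustration; not part of aXeleRate.
    """
    missing = []
    for ann_name in sorted(os.listdir(ann_dir)):
        if not ann_name.lower().endswith(".xml"):
            continue
        root = ET.parse(os.path.join(ann_dir, ann_name)).getroot()
        filename = root.findtext("filename")
        if filename is None:
            # Annotation has no <filename> field at all.
            missing.append((ann_name, None))
            continue
        # Assumed lookup rule: image folder + <filename> value.
        img_path = os.path.join(img_dir, filename.strip())
        if not os.path.isfile(img_path):
            missing.append((ann_name, img_path))
    return missing
```

Calling `find_missing_images("geral/anns", "geral/imgs")` before training would list every annotation whose `filename` no longer matches a file in the image folder.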
So, to summarize: you can't just rename the files; the filenames inside the annotations need to be changed too.
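One way to bring the annotations back in line after a rename is sketched below. It assumes that each image and its annotation share a base name (as with frontalimage37.jpg and frontalimage37.xml in this thread) and that all images use a single extension. `sync_annotation_filenames` is a hypothetical helper; it rewrites files in place, so try it on a copy of the dataset first. It does not touch the optional `path` field.

```python
import os
import xml.etree.ElementTree as ET


def sync_annotation_filenames(ann_dir, img_ext=".jpg"):
    """Set <filename> in each annotation to '<annotation base name><img_ext>'.

    Hypothetical helper for illustration. Returns the names of the
    annotation files that were changed.
    """
    changed = []
    for ann_name in sorted(os.listdir(ann_dir)):
        stem, ext = os.path.splitext(ann_name)
        if ext.lower() != ".xml":
            continue
        ann_path = os.path.join(ann_dir, ann_name)
        tree = ET.parse(ann_path)
        node = tree.getroot().find("filename")
        # Assumption: the image has the same base name as the annotation.
        expected = stem + img_ext
        if node is not None and node.text != expected:
            node.text = expected
            tree.write(ann_path)  # overwrites the annotation in place
            changed.append(ann_name)
    return changed
```

Running it a second time should return an empty list, which is a quick way to confirm that every annotation now matches its renamed image.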
Hi, thanks for the response.
I'll change my dataset
Describe the bug I'm trying to train a MobileNet with approx. 10k training images from the PASCAL VOC dataset, and I parsed all images that do not have the 'person' label. Unfortunately, when I try to train it, errors show up about the integrity of the dataset, saying that it couldn't open the annotation file. It happens with many annotation files; could it be something about the number of files or the files themselves?
Screenshots This is my code, nothing special
This is the debug window
Environment (please complete the following information):
Additional context Should I validate anything else about the dataset images?