kentaroy47 / frcnn-from-scratch-with-keras

Faster R-CNN from scratch written with Keras
Apache License 2.0

Dataset format #8

Open NoraKrbr opened 5 years ago

NoraKrbr commented 5 years ago


Could you explain the data format and the annotation format? I downloaded the VOC2007 dataset to try out your code and am not sure how I should structure the data in directories or how to deal with the XML annotations. Thanks in advance for your work and help.


chamecall commented 5 years ago
kentaroy47 commented 5 years ago

@NoraKrbr I guess you successfully trained the VOC dataset.

To set up your custom dataset, you can either:

1) Make a simple dataset format like

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat

This is handy but doesn't scale well.

2) Use a labeling tool. labelme is a good one, which I use.

labelme outputs have to be converted to VOC format; you can use this script.
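
The simple format from option 1 can be read with a few lines of Python. This is a minimal sketch, not the repository's own parser: the function name `parse_simple_annotations` is hypothetical, and it assumes one bounding box per line, with an image that has several boxes repeating its path on several lines. The column order `path,xmin,ymin,xmax,ymax,class` is the one confirmed later in this thread.

```python
from collections import defaultdict


def parse_simple_annotations(lines):
    """Group boxes by image path.

    Assumes each non-empty line is `path,xmin,ymin,xmax,ymax,class`
    with one box per line (an assumption of this sketch).
    """
    boxes_by_image = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, xmin, ymin, xmax, ymax, class_name = line.split(",")
        boxes_by_image[path].append(
            (int(xmin), int(ymin), int(xmax), int(ymax), class_name)
        )
    return dict(boxes_by_image)


# The two sample lines from the comment above.
sample = [
    "/data/imgs/img_001.jpg,837,346,981,456,cow",
    "/data/imgs/img_002.jpg,215,312,279,391,cat",
]
annotations = parse_simple_annotations(sample)
print(annotations["/data/imgs/img_001.jpg"])
```

Grouping by path means every box for an image ends up in one list, which is usually the shape a training loop wants.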

kentaroy47 commented 5 years ago

@chamecall No, you have to select them as the testing set. For VOC, the test images should be listed in ImageSets/Main/test.txt

The VOC mirror of pjreddie may not include the test images. You can manually split the training set in half and make a test set if you want.
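
Splitting the training list in half, as suggested above, could look like the sketch below. It only shuffles a list of image IDs and cuts it in two; the helper name `split_ids`, the 50/50 fraction, and the fixed seed are assumptions for illustration, and writing the halves back to files such as ImageSets/Main/test.txt is left out.

```python
import random


def split_ids(image_ids, test_fraction=0.5, seed=0):
    """Shuffle image IDs and split them into (train, test) lists."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_test = int(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Hypothetical IDs in the style of VOC file names.
all_ids = ["000005", "000007", "000009", "000012", "000016", "000017"]
train_ids, test_ids = split_ids(all_ids)
print(len(train_ids), len(test_ids))
```

Shuffling before cutting avoids a test set that only contains the last images in the list.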

chamecall commented 5 years ago

Thanks for the quick reply.

NoraKrbr commented 5 years ago

Thank you for your answer. I am still new to object detection, so some things are still unclear to me. My goal will eventually be to train on the COCO dataset. Do I have to write a script that converts COCO annotations to VOC format, or do other formats also work?

kentaroy47 commented 5 years ago

@NoraKrbr Oh, I see. There is an official COCO dataloader you can use for training; you should use that.

COCO takes a week to train, so it is always good to practice with VOC first.

chamecall commented 5 years ago

You said "if you want". Isn't it a mandatory operation? How else can we evaluate accuracy on new images and check whether the network is overfitting?

NoraKrbr commented 5 years ago

@chamecall You can also download the test data from

NoraKrbr commented 5 years ago

Thank you, I will try to integrate the model trained on VOC in my application first and then maybe get back to you at a later point. Thanks for your help so far.

chamecall commented 5 years ago

Thanks, but I want to train the net on custom data, so I'd like to find out how I can estimate the net's accuracy without test data.

mgbvox commented 4 years ago

@kentaroy47 re: the dataset sample format below:

/data/imgs/img_001.jpg,837,346,981,456,cow

Does that follow this convention?

'path, xmin, ymin, xmax, ymax, class'

or

'path, xmin, xmax, ymin, ymax, class'?

GMahdi commented 4 years ago

Hi, thank you for the great work. I am quite new to object detection and your work has helped me a lot in developing my understanding. I have downloaded the VOC dataset and trained on it as explained, but how do I test on it, and is there any way to get mAP or recall metrics for the VOC dataset? Thanks

magloiretouh commented 4 years ago

You can use this code to convert VOC XML annotation files to a single Dataset.txt file:

import os
from bs4 import BeautifulSoup

# Folder that holds the VOC XML annotation files
path = './Annotations/'

# Collect every XML annotation file under the folder
files = []
for root, _dirs, names in os.walk(path):
    for name in names:
        if name.endswith('.xml'):
            files.append(os.path.join(root, name))

with open("./Dataset.txt", "w") as dataset_file:
    for xml_path in files:
        with open(xml_path, "r") as xml_file:
            soup = BeautifulSoup(xml_file.read(), features="html.parser")
        filepath = "./JPEGImages/" + soup.find('filename').string
        for obj in soup.find_all('object'):
            box = obj.find('bndbox')
            x1 = box.find('xmin').string
            y1 = box.find('ymin').string
            x2 = box.find('xmax').string
            y2 = box.find('ymax').string
            class_name = obj.find('name').string
            # Assumed final step (missing from the snippet as posted):
            # write one line per object as path,xmin,ymin,xmax,ymax,class
            dataset_file.write(f"{filepath},{x1},{y1},{x2},{y2},{class_name}\n")

kentaroy47 commented 4 years ago

@mgbvox It is: path, xmin, ymin, xmax, ymax, class. VOC-style annotations use the same corner order, though not every object detection format does (COCO, for example, stores x, y, width, height).
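
One way to check which column order a file actually uses is to test that every box has xmin < xmax and ymin < ymax under the assumed order. The sketch below does this for the sample line from this thread; the helper `is_valid_box` is hypothetical, and a single line passing the check does not prove the order, so it is worth running over the whole file.

```python
def is_valid_box(xmin, ymin, xmax, ymax):
    """Return True if the corners describe a box with positive area."""
    return xmin < xmax and ymin < ymax


line = "/data/imgs/img_001.jpg,837,346,981,456,cow"
path, a, b, c, d, class_name = line.split(",")
a, b, c, d = int(a), int(b), int(c), int(d)

# Reading the columns as xmin, ymin, xmax, ymax gives a valid box.
corner_order_ok = is_valid_box(xmin=a, ymin=b, xmax=c, ymax=d)

# Reading them as xmin, xmax, ymin, ymax would put xmin=837 above xmax=346.
swapped_order_ok = is_valid_box(xmin=a, ymin=c, xmax=b, ymax=d)

print(corner_order_ok, swapped_order_ok)
```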

Shilpa141 commented 2 years ago

How do I create the text file if one image has more than one bounding box? Should the boxes go on a single line or on separate lines?