rbgirshick / py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

Training faster rcnn with own dataset #463

Closed June-Jo closed 7 years ago

June-Jo commented 7 years ago

I'm trying to train Faster R-CNN on my own dataset.

When I run

irobot@irobot:~/py-faster-rcnn$ ./experiments/scripts/faster_rcnn_end2end_june.sh 1 VGG16

I get the error below.

Traceback (most recent call last):
  File "./tools/train_net.py", line 104, in <module>
    imdb, roidb = combined_roidb(args.imdb_name)
  File "./tools/train_net.py", line 69, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "./tools/train_net.py", line 66, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/irobot/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 118, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/irobot/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 106, in append_flipped_images
    boxes = self.roidb[i]['boxes'].copy()
  File "/home/irobot/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 67, in roidb
    self._roidb = self.roidb_handler()
  File "/home/irobot/py-faster-rcnn/tools/../lib/datasets/june.py", line 90, in gt_roidb
    for index in self.image_index]
  File "/home/irobot/py-faster-rcnn/tools/../lib/datasets/june.py", line 205, in _load_june_annotation
    overlaps[ix, cls] = 1.0
IndexError: index 12 is out of bounds for axis 1 with size 7

I changed a lot of things to train faster-rcnn, so let me show you how I made the dataset and what I changed.

Datasets

/py-faster-rcnn/data/JUNE_devkit

JUNE_devkit has three folders, described below.

The Annotations folder holds text files such as 00000014.txt. Each line looks like '1 330 160 540 520', where 1 is the class, (330, 160) is the top-left corner of the ROI, and (540, 520) is the bottom-right corner.

The Images folder holds the pictures, named like 00000014.bmp. All pictures are the same size, 1280x1024.

The ImageSets folder has two text files, train.txt and test.txt. They list the image names, without extensions, of the training set and the test set, for example 00000014.
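For reference, here is a minimal sketch of reading that annotation format, assuming every line is exactly 'class x1 y1 x2 y2' separated by whitespace. The helper names parse_annotation_line and parse_annotations are made up for illustration and are not part of py-faster-rcnn.

```python
import numpy as np


def parse_annotation_line(line):
    """Parse one 'class x1 y1 x2 y2' line; return None if it is malformed."""
    tokens = line.strip().split()
    if len(tokens) != 5:
        return None
    # int(float(...)) also accepts values written as '330.0'
    cls, x1, y1, x2, y2 = (int(float(t)) for t in tokens)
    return cls, (x1, y1, x2, y2)


def parse_annotations(lines):
    """Return (classes, boxes) arrays, keeping only well-formed lines."""
    parsed = [p for p in map(parse_annotation_line, lines) if p is not None]
    classes = np.array([cls for cls, _ in parsed], dtype=np.int32)
    boxes = np.array([box for _, box in parsed], dtype=np.int64).reshape(-1, 4)
    return classes, boxes


classes, boxes = parse_annotations(['1 330 160 540 520\n', 'bad line\n'])
print(classes)  # [1]
print(boxes)    # [[330 160 540 520]]
```

Filtering malformed lines before allocating the arrays is a deliberate choice in this sketch: a loader that sizes its arrays from the raw line count and then skips a line leaves an all-zero box with class 0 behind, which may be worth ruling out.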

Changed codes

/py-faster-rcnn/experiments/scripts

In this folder, I made a file 'faster_rcnn_end2end_june.sh', based on 'faster_rcnn_end2end.sh' and 'faster_rcnn_end2end_imagenet.sh'.

faster_rcnn_end2end_june.sh

set -x
set -e

export PYTHONUNBUFFERED="True"

GPU_ID=$1
NET=$2
NET_lc=${NET,,}
ITERS=100000
DATASET_TRAIN="june_train"
DATASET_TEST="june_test"

array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:2:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}

LOG="experiments/logs/faster_rcnn_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"

NET_INIT=data/imagenet_models/${NET}.v2.caffemodel

time ./tools/train_net.py --gpu ${GPU_ID} \
  --solver models/${NET}/faster_rcnn_end2end/solver.prototxt \
  --weights ${NET_INIT} \
  --imdb ${DATASET_TRAIN} \
  --iters ${ITERS} \
  --cfg experiments/cfgs/faster_rcnn_end2end.yml \
  ${EXTRA_ARGS}

set +x
NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'`
set -x

time ./tools/test_net.py --gpu ${GPU_ID} \
  --def models/${NET}/faster_rcnn_end2end/test.prototxt \
  --net ${NET_FINAL} \
  --imdb ${DATASET_TEST} \
  --cfg experiments/cfgs/faster_rcnn_end2end.yml \
  ${EXTRA_ARGS}

/py-faster-rcnn/lib/datasets

In factory.py, I added the code below.

factory.py

from datasets.june import june
june_devkit_path = '/home/irobot/py-faster-rcnn/data/JUNE_devkit'
for split in ['train', 'test']:
    name = '{}_{}'.format('june', split)
    __sets[name] = (lambda split=split: june(split, june_devkit_path))

june.py is copied from pascal_voc.py with a few changes.

june.py

import datasets
import datasets.june
import os
from datasets.imdb import imdb
import xml.dom.minidom as minidom
import numpy as np
import scipy.sparse
import scipy.io as sio
import utils.cython_bbox
import cPickle
import subprocess

class june(imdb):
    def __init__(self, image_set, devkit_path):
        imdb.__init__(self, image_set)
        self._image_set = image_set
        self._devkit_path = devkit_path
        self._data_path = os.path.join(self._devkit_path, 'data')
        self._classes = ('__background__', # always index 0
                         'Bottle', 'Cell phone', 'Gear', 'Glue', 'Pen', 'Piston shaft')
        self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
        self._image_ext = '.bmp'
        self._image_index = self._load_image_set_index()
        # Default to roidb handler
        self._roidb_handler = self.selective_search_roidb

        # Specific config options
        self.config = {'cleanup'  : True,
                       'use_salt' : True,
                       'top_k'    : 2000}

        assert os.path.exists(self._devkit_path), \
                'Devkit path does not exist: {}'.format(self._devkit_path)
        assert os.path.exists(self._data_path), \
                'Path does not exist: {}'.format(self._data_path)

    def image_path_at(self, i):
        """
        Return the absolute path to image i in the image sequence.
        """
        return self.image_path_from_index(self._image_index[i])

    def image_path_from_index(self, index):
        """
        Construct an image path from the image's "index" identifier.
        """
        #for ext in self._image_ext:
        image_path = os.path.join(self._data_path, 'Images',
                                  index + self._image_ext)
        #    if os.path.exists(image_path):
        #        break
        assert os.path.exists(image_path), \
                'Path does not exist: {}'.format(image_path)
        return image_path

    def _load_image_set_index(self):
        """
        Load the indexes listed in this dataset's image set file.
        """
        # Example path to image set file:
        # self._data_path + /ImageSets/val.txt
        image_set_file = os.path.join(self._data_path, 'ImageSets', 
                                      self._image_set + '.txt')
        assert os.path.exists(image_set_file), \
                'Path does not exist: {}'.format(image_set_file)
        with open(image_set_file) as f:
            image_index = [x.strip() for x in f.readlines()]
        return image_index

    def gt_roidb(self):
        """
        Return the database of ground-truth regions of interest.
        This function loads/saves from/to a cache file to speed up future calls.
        """
        cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
        if os.path.exists(cache_file):
            with open(cache_file, 'rb') as fid:
                roidb = cPickle.load(fid)
            print '{} gt roidb loaded from {}'.format(self.name, cache_file)
            return roidb

        gt_roidb = [self._load_june_annotation(index)
                    for index in self.image_index]
        with open(cache_file, 'wb') as fid:
            cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
        print 'wrote gt roidb to {}'.format(cache_file)

        return gt_roidb

    def selective_search_roidb(self):
        """
        Return the database of selective search regions of interest.
        Ground-truth ROIs are also included.
        This function loads/saves from/to a cache file to speed up future calls.
        """
        cache_file = os.path.join(self.cache_path,
                                  self.name + '_selective_search_roidb.pkl')

        if os.path.exists(cache_file):
            with open(cache_file, 'rb') as fid:
                roidb = cPickle.load(fid)
            print '{} ss roidb loaded from {}'.format(self.name, cache_file)
            return roidb

        if self._image_set != 'test':
            gt_roidb = self.gt_roidb()
            ss_roidb = self._load_selective_search_roidb(gt_roidb)
            roidb = imdb.merge_roidbs(gt_roidb, ss_roidb)
        else:
            roidb = self._load_selective_search_roidb(None)
            print len(roidb)
        with open(cache_file, 'wb') as fid:
            cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
        print 'wrote ss roidb to {}'.format(cache_file)

        return roidb

    def _load_selective_search_roidb(self, gt_roidb):
        filename = os.path.abspath(os.path.join(self._devkit_path,
                                                self.name + '.mat'))
        assert os.path.exists(filename), \
               'Selective search data not found at: {}'.format(filename)
        raw_data = sio.loadmat(filename)['all_boxes'].ravel()

        box_list = []
        for i in xrange(raw_data.shape[0]):
            box_list.append(raw_data[i][:, (1, 0, 3, 2)] - 1)

        return self.create_roidb_from_box_list(box_list, gt_roidb)

    def selective_search_IJCV_roidb(self):
        """
        Return the database of selective search regions of interest.
        Ground-truth ROIs are also included.
        This function loads/saves from/to a cache file to speed up future calls.
        """
        cache_file = os.path.join(self.cache_path,
                '{:s}_selective_search_IJCV_top_{:d}_roidb.pkl'.
                format(self.name, self.config['top_k']))

        if os.path.exists(cache_file):
            with open(cache_file, 'rb') as fid:
                roidb = cPickle.load(fid)
            print '{} ss roidb loaded from {}'.format(self.name, cache_file)
            return roidb

        gt_roidb = self.gt_roidb()
        ss_roidb = self._load_selective_search_IJCV_roidb(gt_roidb)
        roidb = imdb.merge_roidbs(gt_roidb, ss_roidb)
        with open(cache_file, 'wb') as fid:
            cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
        print 'wrote ss roidb to {}'.format(cache_file)

        return roidb

    def _load_selective_search_IJCV_roidb(self, gt_roidb):
        IJCV_path = os.path.abspath(os.path.join(self.cache_path, '..',
                                                 'selective_search_IJCV_data',
                                                 self.name))
        assert os.path.exists(IJCV_path), \
               'Selective search IJCV data not found at: {}'.format(IJCV_path)

        top_k = self.config['top_k']
        box_list = []
        for i in xrange(self.num_images):
            filename = os.path.join(IJCV_path, self.image_index[i] + '.mat')
            raw_data = sio.loadmat(filename)
            box_list.append((raw_data['boxes'][:top_k, :]-1).astype(np.uint16))

        return self.create_roidb_from_box_list(box_list, gt_roidb)

    def _load_june_annotation(self, index):
        """
        Load image and bounding boxes info from txt files.
        """
        filename = os.path.join(self._data_path, 'Annotations', index + '.txt')
        # print 'Loading: {}'.format(filename)
        # Parse groundtruth file
        with open(filename) as f:
            data = f.readlines()

        num_objs = len(data)
        boxes = np.zeros((num_objs, 4), dtype=np.uint16)
        gt_classes = np.zeros((num_objs), dtype=np.int32)
        overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
        # Load object bounding boxes into a data frame.
        for ix, aline in enumerate(data):
            tokens = aline.strip().split()
            if len(tokens) != 5:
                continue
            cls = float(tokens[0]) + 1    # this file uses 0 as the background
            x1 = float(tokens[1])
            y1 = float(tokens[2])
            x2 = float(tokens[3])
            y2 = float(tokens[4])
            gt_classes[ix] = cls
            boxes[ix, :] = [x1, y1, x2, y2]
            overlaps[ix, cls] = 1.0

        overlaps = scipy.sparse.csr_matrix(overlaps)

        return {'boxes' : boxes,
                'gt_classes': gt_classes,
                'gt_overlaps' : overlaps,
                'flipped' : False}

    def _write_june_results_file(self, all_boxes):
        use_salt = self.config['use_salt']
        comp_id = 'comp4'
        if use_salt:
            comp_id += '-{}'.format(os.getpid())

        # VOCdevkit/results/comp4-44503_det_test_aeroplane.txt
        path = os.path.join(self._devkit_path, 'results', self.name, comp_id + '_')
        for cls_ind, cls in enumerate(self.classes):
            if cls == '__background__':
                continue
            print 'Writing {} results file'.format(cls)
            filename = path + 'det_' + self._image_set + '_' + cls + '.txt'
            with open(filename, 'wt') as f:
                for im_ind, index in enumerate(self.image_index):
                    dets = all_boxes[cls_ind][im_ind]
                    if dets == []:
                        continue
                    # the VOCdevkit expects 1-based indices
                    for k in xrange(dets.shape[0]):
                        f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
                                format(index, dets[k, -1],
                                       dets[k, 0] + 1, dets[k, 1] + 1,
                                       dets[k, 2] + 1, dets[k, 3] + 1))
        return comp_id

    def _do_matlab_eval(self, comp_id, output_dir='output'):
        rm_results = self.config['cleanup']

        path = os.path.join(os.path.dirname(__file__),
                            'VOCdevkit-matlab-wrapper')
        cmd = 'cd {} && '.format(path)
        cmd += '{:s} -nodisplay -nodesktop '.format(datasets.MATLAB)
        cmd += '-r "dbstop if error; '
        cmd += 'setenv(\'LC_ALL\',\'C\'); voc_eval(\'{:s}\',\'{:s}\',\'{:s}\',\'{:s}\',{:d}); quit;"' \
               .format(self._devkit_path, comp_id,
                       self._image_set, output_dir, int(rm_results))
        print('Running:\n{}'.format(cmd))
        status = subprocess.call(cmd, shell=True)

    def evaluate_detections(self, all_boxes, output_dir):
        comp_id = self._write_june_results_file(all_boxes)
        #self._do_matlab_eval(comp_id, output_dir)

    def competition_mode(self, on):
        if on:
            self.config['use_salt'] = False
            self.config['cleanup'] = False
        else:
            self.config['use_salt'] = True
            self.config['cleanup'] = True

if __name__ == '__main__':
    d = datasets.june('train', '')
    res = d.roidb
    from IPython import embed; embed()

/py-faster-rcnn/models/VGG16/faster_rcnn_end2end

I copied VGG16 from pascal_voc. Here, the test.prototxt and train.prototxt files have been changed. I'm using 7 classes (6 object classes + 1 background class), so num_classes is 7 and num_output is 28 (7 x 4) in both prototxt files.
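The arithmetic behind those numbers can be sketched as follows, assuming the usual layout of the py-faster-rcnn prototxt files, where the cls_score layer has one output per class and the bbox_pred layer has four box coordinates per class. The function name head_sizes is hypothetical.

```python
def head_sizes(num_foreground_classes):
    """Expected prototxt values for a dataset with the given object classes."""
    num_classes = num_foreground_classes + 1  # +1 for __background__
    return {
        'num_classes': num_classes,    # param_str of the Python data layers
        'cls_score': num_classes,      # num_output: one score per class
        'bbox_pred': 4 * num_classes,  # num_output: 4 coordinates per class
    }


print(head_sizes(6))  # {'num_classes': 7, 'cls_score': 7, 'bbox_pred': 28}
```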

I made these changes by following the instructions here. I also tried removing the cache folder and the output folder, but that doesn't help. I need your help, please. I'm sorry for writing such a long post, but I hope this problem can be solved as soon as possible.

June-Jo commented 7 years ago

Hello, I solved the above problem. June.py has a _load_june_annotation function. In that function,

cls = float(tokens[0]) + 1

should be changed to

cls = float(tokens[0])

because the class ids in my dataset start from 1 (bottle) and 0 is the background, so I don't need to add 1 to cls.
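A range check before indexing makes this kind of off-by-one visible much earlier. The sketch below assumes class ids in the files run from 1 to num_classes - 1, with 0 reserved for the background. The function build_overlaps is an illustrative stand-in, not code from the repository.

```python
import numpy as np


def build_overlaps(class_ids, num_classes):
    """One-hot overlaps matrix; raises if any id is outside 1..num_classes-1."""
    class_ids = np.asarray(class_ids, dtype=np.int64)
    bad = class_ids[(class_ids < 1) | (class_ids >= num_classes)]
    if bad.size:
        raise ValueError('class ids outside 1..{}: {}'.format(
            num_classes - 1, sorted(set(bad.tolist()))))
    overlaps = np.zeros((len(class_ids), num_classes), dtype=np.float32)
    # Integer indices: one row per object, one column per class
    overlaps[np.arange(len(class_ids)), class_ids] = 1.0
    return overlaps


print(build_overlaps([1, 6], num_classes=7).shape)  # (2, 7)

try:
    build_overlaps([6 + 1], num_classes=7)  # what the extra "+ 1" produces
except ValueError as err:
    print(err)  # class ids outside 1..6: [7]
```

Casting the ids to integers also avoids indexing a NumPy array with a float, which newer NumPy versions reject.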

But now I have another problem.

When I run

./experiments/scripts/faster_rcnn_end2end_june.sh 1 VGG16

I get the error below.

Traceback (most recent call last):
  File "./tools/train_net.py", line 104, in <module>
    imdb, roidb = combined_roidb(args.imdb_name)
  File "./tools/train_net.py", line 69, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "./tools/train_net.py", line 66, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/irobot/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 122, in get_training_roidb
    rdl_roidb.prepare_roidb(imdb)
  File "/home/irobot/py-faster-rcnn/tools/../lib/roi_data_layer/roidb.py", line 27, in prepare_roidb
    roidb[i]['image'] = imdb.image_path_at(i)
IndexError: list index out of range

Even after deleting the folder py-faster-rcnn/data/cache, the error doesn't go away. Is there any solution?

June-Jo commented 7 years ago

I resolved this problem by turning off flipping. But now I have another problem: a floating point exception.

./experiments/scripts/faster_rcnn_end2end_june.sh: line 40: 11438 Floating point exception(core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/faster_rcnn_end2end/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}

I tried lowering the learning rate, changing RND_SEED, and using filter_roidb, but nothing solves this problem.

I printed out dw and dh, but there is no problem there; the values were about -1 to 4. I cannot figure out what causes this problem in my case. Does anyone have an idea?

dantp-ai commented 7 years ago

@HyunJun-Jo The flipping function should not be turned off. Instead, figure out whether the pixel coordinates in your annotations are 0-indexed or 1-indexed, and based on that make the corresponding changes in june.py and june_eval.py.

June-Jo commented 7 years ago

@plopd Thanks for your advice.

Okay, I will turn flipping back on. Do I then need to provide flipped images myself?

My dataset has 6 classes (bottle (1), cell phone (2), gear (3), ..., piston (6)) plus a background class (0), so I think cls = float(tokens[0]) is right. What do you think?

Do you have any idea about the floating point exception?

June-Jo commented 7 years ago

When I turn flipping on, the index problem comes back. My training set has 880 images now, but the maximum index is 1760, which is double 880. How can I use flipping?
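As far as can be told from append_flipped_images in lib/datasets/imdb.py, flipping appends one mirrored entry per image and doubles the image index, so 1760 entries for 880 images is expected. The mirroring itself assumes 0-based x coordinates. The sketch below uses a hypothetical flip_boxes helper with the same arithmetic to show what happens when a 1-based box touches the right edge of a 1280-pixel-wide image.

```python
import numpy as np


def flip_boxes(boxes, width):
    """Mirror [x1, y1, x2, y2] boxes horizontally, assuming 0-based pixels."""
    boxes = np.asarray(boxes, dtype=np.int64)  # signed: bad values go negative
    flipped = boxes.copy()
    flipped[:, 0] = width - boxes[:, 2] - 1
    flipped[:, 2] = width - boxes[:, 0] - 1
    return flipped


width = 1280
zero_based = np.array([[329, 159, 1279, 519]])  # coordinates in 0..1279
one_based = np.array([[330, 160, 1280, 520]])   # same box, 1-based

print(flip_boxes(zero_based, width))  # [[  0 159 950 519]]
print(flip_boxes(one_based, width))   # [[ -1 160 949 520]]

# Stored as uint16 (the dtype used for boxes in june.py), -1 wraps around,
# so the flipped box ends up with x1 > x2.
print(flip_boxes(one_based, width).astype(np.uint16)[0, 0])  # 65535
```

If the annotation files are 1-based, subtracting 1 from each coordinate while loading (as pascal_voc.py does) keeps the flipped boxes inside the image.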

With flipping turned off, the floating point exception still comes up. But I found that there is a problem with the loss values.

I0113 15:01:54.791416  4315 solver.cpp:229] Iteration 0, loss = 3.01861
I0113 15:01:54.791440  4315 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0113 15:01:54.791443  4315 solver.cpp:245]     Train net output #1: loss_cls = 1.12346 (* 1 = 1.12346 loss)
I0113 15:01:54.791462  4315 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.795926 (* 1 = 0.795926 loss)
I0113 15:01:54.791467  4315 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0 (* 1 = 0 loss)
I0113 15:01:54.791484  4315 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
I0113 15:02:01.060256  4315 solver.cpp:229] Iteration 20, loss = 0.00354396
I0113 15:02:01.060279  4315 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:01.060286  4315 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I0113 15:02:01.060292  4315 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.00359094 (* 1 = 0.00359094 loss)
I0113 15:02:01.060297  4315 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:01.060317  4315 sgd_solver.cpp:106] Iteration 20, lr = 0.0001
I0113 15:02:07.125023  4315 solver.cpp:229] Iteration 40, loss = 6.55651e-07
I0113 15:02:07.125046  4315 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:07.125051  4315 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I0113 15:02:07.125071  4315 solver.cpp:245]     Train net output #2: rpn_cls_loss = 1.19209e-07 (* 1 = 1.19209e-07 loss)
I0113 15:02:07.125074  4315 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:07.125092  4315 sgd_solver.cpp:106] Iteration 40, lr = 0.0001
I0113 15:02:13.131383  4315 solver.cpp:229] Iteration 60, loss = 0.010118
I0113 15:02:13.131407  4315 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:13.131412  4315 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I0113 15:02:13.131429  4315 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.0202161 (* 1 = 0.0202161 loss)
I0113 15:02:13.131433  4315 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:13.131438  4315 sgd_solver.cpp:106] Iteration 60, lr = 0.0001
I0113 15:02:19.142662  4315 solver.cpp:229] Iteration 80, loss = 7.51021e-06
I0113 15:02:19.142685  4315 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:19.142689  4315 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I0113 15:02:19.142709  4315 solver.cpp:245]     Train net output #2: rpn_cls_loss = 8.10626e-06 (* 1 = 8.10626e-06 loss)
I0113 15:02:19.142712  4315 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0 (* 1 = 0 loss)
I0113 15:02:19.142716  4315 sgd_solver.cpp:106] Iteration 80, lr = 0.0001
./experiments/scripts/faster_rcnn_end2end_june.sh: line 40:  4315 Floating point exception(core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/faster_rcnn_end2end/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}
irobot@irobot:~/py-faster-rcnn$ 

The loss values are very low, and loss_bbox and rpn_loss_bbox are exactly zero from the first iteration! I think the floating point exception comes up because of the very low loss values. Is there any way to solve this problem? Somebody advised me that my dataset was too small (training: 268, test: 31), so I made a bigger dataset (training: 880, test: 120), but the problem is not solved yet.

I think that maybe the bounding boxes are not generated properly. What should I modify? I checked june.py and june_eval.py, but I couldn't find any errors there.
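One way to test that suspicion is to scan the ground-truth boxes for degenerate or out-of-range entries before training. This is only a sketch under the assumption that boxes are 0-based [x1, y1, x2, y2] rows; find_bad_boxes is a hypothetical helper, and nothing here confirms that bad boxes are what triggers the floating point exception.

```python
import numpy as np


def find_bad_boxes(boxes, width, height):
    """Return row indices of boxes that are empty, inverted, or out of bounds."""
    boxes = np.asarray(boxes, dtype=np.int64).reshape(-1, 4)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    bad = (
        (x2 <= x1) | (y2 <= y1)          # zero or negative width/height
        | (x1 < 0) | (y1 < 0)            # left of or above the image
        | (x2 >= width) | (y2 >= height) # right of or below the image
    )
    return np.flatnonzero(bad)


boxes = [
    [10, 10, 50, 50],    # fine
    [60, 20, 60, 80],    # zero width
    [0, 0, 1280, 100],   # x2 is past the last column of a 1280-wide image
]
print(find_bad_boxes(boxes, width=1280, height=1024))  # [1 2]
```

Running this over every entry of the roidb, for example on roidb[i]['boxes'], would list the images whose annotations need a closer look.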

Because I'm very new to this field, and to computer vision in general, I'm very confused now. Help me, brothers!

June-Jo commented 7 years ago

I don't know why the floating point exception came up, but after I expanded my dataset I no longer get it.

Soda-Wong commented 7 years ago

@HyunJun-Jo Hi~ have you solved the problem?

indsak commented 7 years ago

Hi. When I start Fast R-CNN training, I get the following error:

  File "/home/alpha/fast-rcnn/py-faster-rcnn/tools/../lib/datasets/fishclassify.py", line 118, in rpn_roidb
    if int(self._year) == 2007 or self._image_set != 'test':
AttributeError: 'fishclassify' object has no attribute '_year'

I checked basketball.py in the basketball project. There, too, _year is not declared in def __init__. Please advise me how to overcome this error.

yanxp commented 7 years ago

@HyunJun-Jo Hello, have you solved the problem?

Ram-Godavarthi commented 6 years ago

Hi guys, while training the network on my own data, I got this error. What is the solution?

I0607 11:45:46.728519 2386 net.cpp:283] Network initialization done.
I0607 11:45:46.728889 2386 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from ./data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
I0607 11:45:48.512673 2386 net.cpp:816] Ignoring source layer data
F0607 11:45:48.516479 2386 net.cpp:829] Cannot copy param 0 weights from layer 'conv3'; shape mismatch. Source param shape is 384 256 3 3 (884736); target param shape is 512 256 3 3 (1179648). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
Check failure stack trace:
Aborted (core dumped)

I have 2 classes, and my images are 512 x 512.

Can someone who knows about this please help?