rbgirshick / fast-rcnn

Fast R-CNN

Train a Fast R-CNN ConvNet on another dataset #11

Closed SunShineMin closed 9 years ago

SunShineMin commented 9 years ago

Hi, I am trying to train a Fast R-CNN ConvNet on another dataset, and something goes wrong at the snapshot step, at this line:

```
net.params['bbox_pred'][0].data[...] = (net.params['bbox_pred'][0].data * self.bbox_stds[:, np.newaxis])
ValueError: operands could not be broadcast together with shapes (84,4096) (12,1)
```

I think the difference in the number of classes between my dataset and VOC may account for the issue. Is it necessary to change some parameters of the pretrained model? If so, how should I change them?

rbgirshick commented 9 years ago

It looks like you have 3 classes. In the train.prototxt and test.prototxt files that you're using, you'll need to change num_output from 21 to 3 in the cls_score layer and from 84 to 12 in the bbox_pred layer. You'll also need to change num_classes from 21 to 3 in the Python layer that provides data to the net (the very first layer).
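For concreteness, a hedged sketch of those edits, assuming 3 classes means 2 object classes plus background (fragments only; layer names follow the stock VOC prototxts):

```
# train.prototxt / test.prototxt -- fragments, not complete layer definitions
layer {
  name: "cls_score"
  inner_product_param { num_output: 3 }    # was 21: one score per class
}
layer {
  name: "bbox_pred"
  inner_product_param { num_output: 12 }   # was 84: 4 box coordinates per class
}
```

In the Python data layer at the very top of train.prototxt, the num_classes entry in its param_str changes from 21 to 3 in the same way.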

SunShineMin commented 9 years ago

Thank you very much for your reply; I have solved that problem with your help. But when I increased the number of classes to 12, an AssertionError occurred in imdb.append_flipped_images, at assert (boxes[:, 2] >= boxes[:, 0]).all(). Is there something wrong with my proposals? I have used these same proposals successfully in R-CNN. @rbgirshick PS: I found that one of the boxes[:,0] values is 65535, which causes the error, but where does this value come from?

nascimentocrafael commented 9 years ago

Hi SunShineMin,

Could you please share the steps to train a Fast R-CNN ConvNet on another dataset? Also, I'm having problems downloading the PASCAL dataset; the links seem to be broken. Did you have any problem downloading it? If you help me I'll be extremely thankful. If you prefer, you can e-mail me.

Thank you in advance.

ssakhavi commented 9 years ago

@SunShineMin @rbgirshick @nascimentocrafael Some of the forks of this repo have attempted (or succeeded at) exactly this.

I refer you to:

- https://github.com/EdisonResearch/fast-rcnn (@EdisonResearch)
- https://github.com/raingo/fast-rcnn (@raingo)

zeyuanxy commented 9 years ago

@SunShineMin @rbgirshick @nascimentocrafael @ssakhavi I have succeeded in training it on INRIA Person; see https://github.com/EdisonResearch/fast-rcnn for more details. Thanks!

zeyuanxy commented 9 years ago

How to Train Fast-RCNN on Another Dataset

https://github.com/EdisonResearch/fast-rcnn/tree/master/help/train

SunShineMin commented 9 years ago

@ssakhavi @zeyuanxy @rbgirshick Thank you very much for the help above. I have trained Fast R-CNN on my own dataset, but the mAP is lower than R-CNN's. I changed the learning rate and the number of iterations, with little improvement. Which other parameters can I modify to improve the mAP?

SunShineMin commented 9 years ago

@nascimentocrafael I have sent you an email with the VOC dataset attached; please have a look.

zeyuanxy commented 9 years ago

@SunShineMin Which dataset did you train Fast-RCNN on? I am so glad to meet a Tsinghua peer here.

github-anurag commented 9 years ago

@rbgirshick @zeyuanxy @SunShineMin

How do we go about training for the negative/background class? After reading the paper, I understand that if the IoU between a bbox proposal and our labelled bbox is < 0.3, we consider that proposal negative. Do we specify these bboxes as the background class for each image while creating the training-set imdbs?

I imagine a Caffe layer could do this... does something like that exist already? Thanks.

hengck23 commented 9 years ago

@github-anurag If you look at fast-rcnn/lib/datasets/pascal_voc.py:

```python
def selective_search_roidb(self):
    if int(self._year) == 2007 or self._image_set != 'test':
        gt_roidb = self.gt_roidb()                               # line 124
        ss_roidb = self._load_selective_search_roidb(gt_roidb)   # line 125
        roidb = datasets.imdb.merge_roidbs(gt_roidb, ss_roidb)   # line 126
```

It means that during training, the ground-truth annotations are read first (line 124); these are the positive samples. Then the selective-search boxes are loaded, and those that do not overlap the ground truth are marked as background (aka negative samples) in line 125. Finally, both positive and negative samples are combined into roidb and used for training (line 126).

note that "selective_search_roidb()" is always called when you use imdb.roidb(), see " fast-rcnn/lib/datasets/imdb.py:" @property def roidb(self): imdb is your train database and roidb is the roi of your train database.

In short, you need to supply ground-truth annotations and a set of candidate boxes (e.g. from selective search) for training. Hope this explanation helps.
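To make the overlap rule concrete, here is a minimal Python sketch of the idea; the 0.5 threshold and all names are illustrative, not the repo's exact code (its real thresholds live in the training config):

```python
def iou(box, gt):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    iw, ih = max(ix2 - ix1 + 1, 0), max(iy2 - iy1 + 1, 0)
    area = lambda b: (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    inter = float(iw * ih)
    return inter / (area(box) + area(gt) - inter)

# A proposal keeps an object label only if it overlaps a ground-truth box
# enough; otherwise the roidb marks it as background.
gt_box = (50, 50, 150, 150)
for proposal in [(60, 60, 160, 160), (300, 300, 400, 400)]:
    label = 'object' if iou(proposal, gt_box) >= 0.5 else 'background'
    print(proposal, '->', label)
```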

anuchandra commented 9 years ago

@zeyuanxy @SunShineMin

Thanks for the instructions on how to train on your own dataset. I tried to follow them using one ImageNet class and a background class, but I'm getting exactly the same error SunShineMin did.

For background: I computed the selective-search proposals, then ran the train_net.py script. Any suggestions? SunShineMin, how did you fix this? Does anyone else have suggestions?

```
Traceback (most recent call last):
  File "./tools/train_net.py", line 80, in <module>
    roidb = get_training_roidb(imdb)
  File "/home/ubuntu/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 107, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/ubuntu/fast-rcnn/tools/../lib/datasets/imdb.py", line 104, in append_flipped_images
    assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError
```

github-anurag commented 9 years ago

@anuchandra This assertion error is due to a problem with your annotation files. The annotations store the bounding-box information as column 0 = xmin, column 1 = ymin, column 2 = xmax, column 3 = ymax. The assertion fails because you have xmin > xmax in one or more of your annotation files. Try to fix this.
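If it helps, a hedged snippet for hunting such rows down before training (find_bad_boxes is a made-up helper, not part of the repo):

```python
import numpy as np

def find_bad_boxes(boxes):
    """Return indices of (xmin, ymin, xmax, ymax) rows with xmin > xmax or ymin > ymax."""
    bad = (boxes[:, 2] < boxes[:, 0]) | (boxes[:, 3] < boxes[:, 1])
    return np.where(bad)[0]

boxes = np.array([[10, 10, 50, 50],
                  [60, 10, 20, 50]])   # second row has xmin > xmax
print(find_bad_boxes(boxes))           # -> [1]
```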

YihangLou commented 9 years ago

@anuchandra I know what caused this error, because I ran into it too and solved it. The cause is probably the "-1" operation applied while reading the annotation file or your selective-search .mat file. The assert expression checks whether the horizontal flip succeeded: after flipping, x2 must be larger than x1. Check whether your pixel coordinates start from 0 or from 1. If they start from 0, do not perform the "-1" operation you can see in pascal_voc.py, or the x coordinates can become negative and lead to this error. You can find this operation in _load_pascal_annotation() and _load_selective_search_roidb(). :)
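A minimal sketch of the underflow this describes (illustrative, not the repo's exact code): pascal_voc.py stores boxes as np.uint16, so an already 0-based xmin of 0 wraps around after the -1 shift, producing exactly the 65535 value SunShineMin saw earlier in this thread:

```python
import numpy as np

xmin = 0                                     # annotation is already 0-based
x1 = np.array([xmin - 1]).astype(np.uint16)  # -1 wraps around in uint16
print(x1[0])                                 # -> 65535
```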

anuchandra commented 9 years ago

@YihangLou

You're absolutely right. I added checks for negative x1 and y1 in _load_pascal_annotation, and this fixed a couple of issues. First, the assert (boxes[:, 2] >= boxes[:, 0]).all() no longer fires. Second, I had been getting NaNs in my loss_bbox during training, and these are gone as well. I carefully compared the annotations that were producing negative x1 values with those caught by the assert; it seems the assert wasn't catching all of them. Thanks!

sunshineatnoon commented 9 years ago

@anuchandra Have you solved the assertion problem? I trained on ImageNet and have the same problem; even after deleting the -1 operation, I still hit it.

anuchandra commented 9 years ago

@sunshineatnoon I kept the -1 operations. Instead, I added checks in _load_pascal_annotation(), after these lines:

```python
# Make pixel indexes 0-based
x1 = float(get_data_from_tag(obj, 'xmin')) - 1
y1 = float(get_data_from_tag(obj, 'ymin')) - 1
x2 = float(get_data_from_tag(obj, 'xmax')) - 1
y2 = float(get_data_from_tag(obj, 'ymax')) - 1
```

I added:

```python
# Clamp coordinates that went negative after the -1 shift
if x1 < 0:
    x1 = 0
if y1 < 0:
    y1 = 0
```

That took care of annotations where x1 or y1 were 0 before the -1 shift.

sunshineatnoon commented 9 years ago

@anuchandra Thanks for your help. I added those lines too, but I still get this assertion error. Did I miss anything else here?

anuchandra commented 9 years ago

@sunshineatnoon In addition, you could try checking the -1 operation in _load_selective_search_roidb; see @YihangLou's suggestion above. This didn't prove an issue for my dataset, but it could be with yours:

```python
box_list.append(raw_data[i][:, (1, 0, 3, 2)] - 1)
```

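For intuition, a hedged illustration of what that line does (the (y1, x1, y2, x2) input order is the usual convention for the MATLAB selective-search output; verify it against your own .mat files):

```python
import numpy as np

raw = np.array([[1, 11, 100, 210]])   # one box as (y1, x1, y2, x2), 1-based
print(raw[:, (1, 0, 3, 2)] - 1)       # -> [[ 10   0  209  99]]: 0-based (x1, y1, x2, y2)
```
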
IchibanKanobee commented 9 years ago

I have a couple of questions about training Fast R-CNN on another dataset, as described in https://github.com/EdisonResearch/fast-rcnn/tree/master/help/train

Do I put negative and positive images in the same "INRIA/data/Images" folder, or do I separate them into sub-folders like "INRIA/data/Images/positive" and "INRIA/data/Images/negative"? More generally, if I have several categories, do I put all of their images into the same "INRIA/data/Images" folder, or into separate "INRIA/data/Images/Category1" ... "INRIA/data/Images/CategoryN" sub-folders?

The second question: do I need annotations for negative samples as well, or only for positive ones?

Thanks a lot for your help!

sid027 commented 9 years ago

@IchibanKanobee were you able to figure out if one has to create subfolders?

IchibanKanobee commented 9 years ago

I am still not able to train on my own dataset, so I might be wrong, but my understanding is that it is not necessary to separate the images into sub-folders; at least selective_search.py is written so that all images sit in the same folder when the ROIs are generated. I also understand that annotations are needed for negative samples as well, but I am not sure whether the region of interest in that case is the whole image (with the center pointing to the image center) or [0, 0, 0, 0] (with the center pointing to [0, 0]).

catsdogone commented 9 years ago

@zeyuanxy I tried to train on my own database as described in https://github.com/EdisonResearch/fast-rcnn/tree/master/help/train, but something goes wrong:

```
...
File "/usr/local/fast-rcnn/tools/../lib/datasets/imdb.py", line 167, in create_roidb_from_box_list
    argmaxes = gt_overlaps.argmax(axis=1)
ValueError: attempt to get argmax of an empty sequence
```

Thanks a lot for your help!

sid027 commented 9 years ago

@IchibanKanobee I think it is the corner. I think annotations are needed for all classes except the background. I am trying to train the detector on the ILSVRC2012 ImageNet dataset. Did you use the same dataset?

zeyuanxy commented 9 years ago

@catsdogone It seems that there is some problem with your selective search file.

IchibanKanobee commented 9 years ago

@sid027 I am following the training instructions in the readme.md in the help folder and using the VGG_CNN_M_1024 model.

If I don't have annotations for the background files, I get an error:

```
File "./tools/train_net.py", line 80, in <module>
    roidb = get_training_roidb(imdb)
File "/home/GitHub/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 111, in get_training_roidb
    rdl_roidb.prepare_roidb(imdb)
File "/home/GitHub/fast-rcnn/tools/../lib/roi_data_layer/roidb.py", line 23, in prepare_roidb
    roidb[i]['image'] = imdb.image_path_at(i)
IndexError: list index out of range
```

That's why I thought annotations for the background files are required. But I haven't gotten training to work yet, so I might be wrong.

IchibanKanobee commented 9 years ago

@zeyuanxy Do you include background files in the training set? If so, do you include annotations for them? And if you do, what are the bounding boxes: the whole image, or [0, 0, 0, 0]?

Thanks!

anuchandra commented 9 years ago

@IchibanKanobee The easiest way to figure out how Fast R-CNN works is to take the working PASCAL VOC 2007 example and strip it down to one image in one class. You can use the Edison training guide; it'll help you do that.

IchibanKanobee commented 9 years ago

@anuchandra Thank you for the tip! Could you please point me to the working PASCAL VOC 2007 example?

anuchandra commented 9 years ago

The instructions for the working example are on the home page.

https://github.com/rbgirshick/fast-rcnn#beyond-the-demo-installation-for-training-and-testing-models

All the info you need is on these discussion forums.

sunshineatnoon commented 9 years ago

@zeyuanxy Hi~ Thanks to your INRIA docs, I successfully trained a model on INRIA. I don't have MATLAB on my computer, so I used dlib's selective search; I want to create a version that can be run without MATLAB, so I need to write an evaluation script in Python. But I don't know how to evaluate prediction precision on INRIA. Could you please give me some hints? Thanks.

catsdogone commented 9 years ago

@sunshineatnoon Maybe this is what you need: https://github.com/rbgirshick/fast-rcnn/pull/33/files

sunshineatnoon commented 9 years ago

@anuchandra Thx, getting rid of the -1 operation solved my problem.

sunshineatnoon commented 9 years ago

@catsdogone Thx, I wrote some test code myself.

yawadugyamfi commented 9 years ago

When passing arguments for training on the INRIA Person dataset, what does inria_train represent in the command below? Is it the train.mat file generated by selective search, or the folder containing the images of your training data?

```
./tools/train_net.py --gpu 0 \
    --solver models/VGG_CNN_M_1024/solver.prototxt \
    --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel \
    --imdb inria_train
```
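A hedged sketch of what inria_train resolves to: it is a registered dataset name that lib/datasets/factory.py maps to an imdb object, not a file or a folder. The inria class below is hypothetical, something you would write yourself following the existing dataset classes:

```python
# lib/datasets/factory.py -- illustrative fragment, not the stock file
from datasets.inria import inria   # hypothetical dataset class for INRIA Person

__sets = {}
# Register the name used on the command line, e.g. --imdb inria_train
__sets['inria_train'] = (lambda: inria('train', 'data/INRIA'))

def get_imdb(name):
    """Look up an image database (imdb) by its registered name."""
    if name not in __sets:
        raise KeyError('Unknown dataset: {}'.format(name))
    return __sets[name]()
```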

zsc-tju commented 8 years ago

@zeyuanxy @rbgirshick I have a question: given that we set scale=600 in config.py, why does roi_data_layer/layer.py still reshape top[0] to 100x100, and why is the input size 227x227 at test time? Please enlighten me, thank you very much!

siddharthm83 commented 8 years ago

@anuchandra @SunShineMin @sunshineatnoon I was working on the ImageNet dataset and got the assertion error in imdb.append_flipped_images at assert (boxes[:, 2] >= boxes[:, 0]).all(). From http://image-net.org/download-bboxes:

> Remark: In the bounding box annotations, there are two fields (<width> and <height>) indicating the size of the image. The location and size of a bounding box in the annotation file are relative to this size. However, this size may not be identical to the real image size in the downloaded package. (The reason is that the size in the annotation file is the displayed size in which the image was shown to an annotator.) Therefore, to locate the actual pixels on the original image, you might need to rescale the bounding boxes accordingly.

It appears that one would need to rescale the values in the XML to the true size of the image.
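A hedged helper sketch for that rescaling (rescale_box is a made-up name; the repo has no such utility):

```python
def rescale_box(box, ann_size, img_size):
    """Scale an (xmin, ymin, xmax, ymax) box from the annotation's
    (width, height) to the real image's (width, height)."""
    sx = img_size[0] / float(ann_size[0])
    sy = img_size[1] / float(ann_size[1])
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)

# The annotation said 500x375, but the downloaded image is really 1000x750:
print(rescale_box((10, 20, 110, 220), ann_size=(500, 375), img_size=(1000, 750)))
# -> (20.0, 40.0, 220.0, 440.0)
```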

aragon111 commented 8 years ago

@IchibanKanobee How did you fix the "IndexError: list index out of range"? I've been stuck there for two days trying to train on my own dataset.

IchibanKanobee commented 8 years ago

@attiliotnt Make sure to delete the old cache folder before starting the new training.

aragon111 commented 8 years ago

Thank you for the answer @IchibanKanobee. My problem was that the annotation files I got from MATLAB didn't have the folder and name fields, so I edited them in myself.

anuchandra commented 8 years ago

@siddharthm83 This is useful to know for the future.

MinaRe commented 8 years ago

Dear all, I finally ran Fast R-CNN on my dataset and got acceptable results on the object detection task, but I was wondering about the segmentation task: has anyone done it before?

aragon111 commented 8 years ago

Hello, I would like to ask which script you chose to run for detection. I tried using my webcam, but the results are not so good.

MinaRe commented 8 years ago

@attiliotnt I applied Fast R-CNN to cancer cell detection.

aragon111 commented 8 years ago

@MinaRe Very interesting. Do you use images or video for testing?

MinaRe commented 8 years ago

3D images :) Do you know anything about doing segmentation with Fast R-CNN?

aragon111 commented 8 years ago

@MinaRe No, unfortunately. I worked just with detection.

aragon111 commented 8 years ago

I would like to ask for advice about which model to use during training. I need to detect certain objects underwater, and I used CaffeNet.v2.caffemodel. The results I have obtained so far are really bad. Any advice?

MinaRe commented 8 years ago

I did the training on an Asus-G751JT, on the CPU.

aragon111 commented 8 years ago

I have a small doubt. When labelling the training set, is it possible to label multiple items (ROIs) of the same class in the same image?