rbgirshick / py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

py-faster-rcnn on new dataset #27

Open leefionglee opened 8 years ago

leefionglee commented 8 years ago

Hi all, I would like to train py-faster-rcnn on my own dataset, but what exactly is the data format (images, annotations, train, val)? Can anyone post an example annotation file here? I found one post, https://github.com/zeyuanxy/fast-rcnn/tree/master/help/train, in which the annotation is plain text, while PASCAL uses XML. Which one is it? Any reference blogs or tutorials would be highly appreciated. Thanks.

athus1990 commented 8 years ago

PASCAL uses XML annotation files. Look at the original VOCdevkit code (readme and helper functions). The XML can be converted to a struct in MATLAB, and you can see that it holds information about the class name, bounding boxes, etc.
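
For reference, here is a minimal Python sketch of reading the class name and bounding box of each object from a PASCAL-style XML annotation. The sample XML is hypothetical and trimmed to the fields used here, and the helper name `parse_voc_annotation` is made up for illustration; real VOC files carry more fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, trimmed to the fields read below.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>0</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return a list of {"name": str, "bbox": (xmin, ymin, xmax, ymax)} dicts."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({"name": obj.find("name").text.strip(), "bbox": coords})
    return objects


for item in parse_voc_annotation(SAMPLE_XML):
    print(item["name"], item["bbox"])
```

A text-based annotation format carries the same information (one class label plus four box coordinates per object), so either format works as long as the dataset loader parses it consistently.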

yileo19920925 commented 8 years ago

Hi @leefionglee, I trained fast-rcnn on another dataset last week. Whether the annotation is XML or txt does not matter. In /fast-rcnn/lib/datasets/inria.py there is a function named _load_inria_annotation that handles the annotation file :) I used XML for my annotations.

kshalini commented 8 years ago

@yileo19920925 Can you please share some insights on your training steps? Is it as simple as getting the data into the format of VOC2007 trainval and test, placing it in the right folder paths, and just calling alt_opt.sh?

Do we need to write any Python code or modify any other files, as for Fast R-CNN training (factory.py etc.)? Thanks in advance.

yileo19920925 commented 8 years ago

@kshalini Hi, I trained my dataset using Fast (not Faster) R-CNN. But I think both Fast and Faster R-CNN require an imdb for the training process. To provide the imdb we need to write a dataset class or modify pascal_voc.py. The role of factory.py is to register that dataset alongside pascal_voc.
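
To make the role of factory.py concrete, here is a rough sketch of the name-to-constructor registry idea. It is not the repository's code: `MyDataset`, `_sets`, and the `amphora_*` names are hypothetical stand-ins, and a real dataset class would subclass the repo's imdb base class and implement the roidb loading methods.

```python
class MyDataset(object):
    """Hypothetical stand-in for an imdb subclass like the one in pascal_voc.py."""

    def __init__(self, image_set):
        self.name = "amphora_" + image_set
        self.image_set = image_set


# Maps a dataset name to a zero-argument constructor (assumed layout).
_sets = {}
for split in ("train", "test"):
    name = "amphora_{}".format(split)
    # Bind split as a default argument so each lambda keeps its own value.
    _sets[name] = (lambda split=split: MyDataset(split))


def get_imdb(name):
    """Look up a dataset by the name passed on the command line via --imdb."""
    if name not in _sets:
        raise KeyError("Unknown dataset: {}".format(name))
    return _sets[name]()


imdb = get_imdb("amphora_train")
print(imdb.name, imdb.image_set)
```

The point of the registry is that the training scripts only ever see a dataset name, so adding a dataset comes down to writing the class and registering its name.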

This helped me a lot when I trained on my data (I hope it is useful to you): https://github.com/coldmanck/fast-rcnn/blob/master/README.md

About alt_opt: I did not use it when training. In Fast R-CNN you can use ./tools/train_net.py. In Faster R-CNN, ./tools/train_faster_rcnn_alt_opt.py may be more straightforward.

leejiajun commented 8 years ago

@leefionglee @athus1990 @kshalini

I trained with ImageNet, but hit an error in /lib/rpn/anchor_target_layer.py.

#error
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 132, in train_rpn
    max_iters=max_iters)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 136, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 104, in train_model
    self.solver.step(1)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/rpn/anchor_target_layer.py", line 137, in forward
    gt_argmax_overlaps = overlaps.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence

When execution reaches the "only keep anchors inside the image" step, len(inds_inside) is equal to 0.

    # only keep anchors inside the image
    inds_inside = np.where(
        (all_anchors[:, 0] >= -self._allowed_border) &
        (all_anchors[:, 1] >= -self._allowed_border) &
        (all_anchors[:, 2] < im_info[1] + self._allowed_border) &  # width
        (all_anchors[:, 3] < im_info[0] + self._allowed_border)    # height
    )[0]

How can I solve this?

andrewliao11 commented 8 years ago

@leefionglee @kshalini Hi! I've trained Faster R-CNN on ImageNet (200 categories); I hope this can help you: https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md

banxiaduhuo commented 8 years ago

@leejiajun Hi leejiajun, did you solve this problem? I have the same problem.

LiberiFatali commented 8 years ago

@leejiajun @banxiaduhuo Hi, did you solve the problem? I got the same issue when training the ZF model on a custom dataset.

leejiajun commented 8 years ago

@banxiaduhuo @LiberiFatali

This happens because the width-to-height ratio of some images is too small or too large. You can remove those images to solve the problem.

LiberiFatali commented 8 years ago

Right, some images with a large width-to-height ratio caused this. Thanks!
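
One way to find such images before training is sketched below. It assumes the image is rescaled so that its short side is 600 px with the long side capped at 1000 px, and that an image whose short side ends up below roughly 96 px cannot contain any anchor. All three numbers are assumptions to check against your own config and anchor settings, and both function names are hypothetical.

```python
def rescaled_size(width, height, target_short=600, max_long=1000):
    """Rescale so the short side is target_short, capped so the long side <= max_long."""
    scale = float(target_short) / min(width, height)
    if scale * max(width, height) > max_long:
        scale = float(max_long) / max(width, height)
    return width * scale, height * scale


def is_trainable(width, height, min_side=96):
    """Return False when the rescaled short side is below min_side (assumed threshold)."""
    new_w, new_h = rescaled_size(width, height)
    return min(new_w, new_h) >= min_side


# Hypothetical (width, height) pairs read from the annotation files.
sizes = {"ok.jpg": (500, 375), "banner.jpg": (2000, 100)}
for name, (w, h) in sorted(sizes.items()):
    print(name, is_trainable(w, h))
```

With these assumed values, a 2000x100 image is rescaled to 1000x50, which is too thin for any anchor to lie fully inside it, so the anchor filter returns an empty set.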

ck196 commented 8 years ago

@leejiajun How did you train on the ImageNet dataset? Could you write some instructions for this step? Thank you.

leejiajun commented 8 years ago

@monkeykju All the work is in coding lib/datasets/pascal_voc.py and lib/datasets/factory.py. You could read this: https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md

daf11865 commented 8 years ago

@leefionglee Can you tell me why the training image ratio would cause such an error? Is it because the RPN produces no RoI that can be considered gt under that ratio? If yes, why would too big or too small a ratio have this effect? Thank you.

karenyun commented 8 years ago

@leefionglee I followed inria.py (two classes), but when I train on my dataset the machine always crashes. I forgot to prepare the negative images; could this cause the issue, or might there be another reason? For example: [image: qq 20160315112658] How do I make the negative images and annotations? Thanks.

LiberiFatali commented 8 years ago

I remember that this py-faster-rcnn doesn't use selective search.

aragon111 commented 8 years ago

@MarkoArsenovic I wanna train the net on my own dataset. I don't understand how to modify the python files before training. I can't find any repo where those steps are explained.

deboc commented 8 years ago

Hi, Let's have a look there

aragon111 commented 8 years ago

Thanks @deboc. I used coldmanck's instructions for training fast-rcnn. The problem is that a few steps are different (constructing the IMDB file and editing the shell script).

deboc commented 8 years ago

OK, tools/train_net.py is in fact an old script for fast-rcnn. You can launch the training directly without any shell script (example for alt_opt training):

$ cd <py-faster-rcnn folder>
$ ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name <model name> --weights <pretrained .caffemodel> --imdb <dataset name>_train

or just use tools/train_faster_rcnn_alt_opt.py

deboc commented 8 years ago

OK, we are definitely missing a real tutorial dedicated to py-faster-rcnn with a simple dataset. Chances are your inria.py needs 2 more methods (rpn_roidb & _load_rpn_roidb). I'm sorry I don't remember exactly how I built mine, but if you want to finish the job have a look here

aragon111 commented 8 years ago

Hello @deboc , I tried running the training as you explained but after I typed

./train_faster_rcnn_alt_opt.py --gpu 0 --solver /models/pascal_voc/ZF/faster_rcnn_end2end/solver.prototxt --weights /data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel --imdb amphora_train

I got this error: train_faster_rcnn_alt_opt.py: error: unrecognized arguments: --solver /models/pascal_voc/ZF/faster_rcnn_end2end/solver.prototxt

I don't understand what is wrong with the solver I chose.

deboc commented 8 years ago

You are using the alt_opt script with an end2end solver, so it won't work.

For alt_opt training the argument is not --solver but --net_name, which specifies the model folder. Then train_faster_rcnn_alt_opt.py automatically looks for solvers in models//faster_rcnn_alt_opt/, where you should put the solvers for alt_opt training.

aragon111 commented 8 years ago

Thank you @deboc. I understand now, and I edited the faster_rcnn_test file as I did in fast-rcnn. Now I get an error about the stage1_rpn_train file (which is in the right directory).

I0421 19:04:00.501624 20093 solver.cpp:61] Creating training net from train_net file: models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
F0421 19:04:00.501646 20093 io.cpp:34] Check failed: fd != -1 (-1 vs. -1) File not found: models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
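
A "File not found" on a path that does exist often comes down to which directory the relative path is resolved from, since Caffe opens the train_net path exactly as written in the solver file. The sketch below (the helper name `missing_train_nets` is hypothetical) lists every train_net entry that cannot be found relative to a given root, so you can check whether the paths match the directory you launch training from.

```python
import os
import re

# Matches lines such as:  train_net: "models/.../stage1_rpn_train.pt"
_TRAIN_NET = re.compile(r'^\s*train_net\s*:\s*"([^"]+)"', re.MULTILINE)


def missing_train_nets(solver_paths, repo_root="."):
    """Return (solver, train_net) pairs whose train_net file does not exist."""
    missing = []
    for solver in solver_paths:
        with open(solver) as handle:
            text = handle.read()
        for net_path in _TRAIN_NET.findall(text):
            # Relative paths are checked against repo_root, i.e. the launch directory.
            if not os.path.isfile(os.path.join(repo_root, net_path)):
                missing.append((solver, net_path))
    return missing
```

Calling it with the solver files for your model and `repo_root` set to the py-faster-rcnn folder should return an empty list when all paths line up.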

aragon111 commented 8 years ago

@MarkoArsenovic Did you figure out how to train py-faster-rcnn on another dataset?

aragon111 commented 8 years ago

@MarkoArsenovic I'm in the same situation...I found all the necessary steps for fast rcnn but still can't get any solution for py-faster-rcnn

deboc commented 8 years ago

I'm updating zeyuanxy's tutorial for py-faster-rcnn, I'll let you know

aragon111 commented 8 years ago

@deboc that sounds great! Thank you!

deboc commented 8 years ago

I wasn't aware of the last updates with the config, that puzzled me a little ;) Here is my contribution, I hope it will help: https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md

deboc commented 8 years ago

No you don't need to rename since the output dimension is not the same. Your issue looks really weird. Can you double check the 4 stage{something}train.pt files in your model folder ? The 84 shape in your error definitely seems to be an old 21x4...

nazneenrajani commented 8 years ago

Hi, can someone help me with running this ResNet model?

I modified the ResNet prototxt file to have the ROI proposals and was able to fine-tune it successfully in 50K iterations on the val set:

Train net output #3: rpn_loss_bbox = 0.0179207 (* 1 = 0.0179207 loss)

Now I am trying to use the final caffemodel along with the deploy.prototxt for ResNet, modified as a test prototxt (removing lr params and the top layer, and appending the input data layer). However, I keep getting this error:

File "./lib/rpn/anchor_target_layer.py", line 116, in forward
    (all_anchors[:, 2] < im_info[0][1] ) & # width
ValueError: operands could not be broadcast together with shapes (17100,) (600,800)

I printed a few debug statements but was unsuccessful in troubleshooting. Any help would be appreciated.

Anubhav7 commented 8 years ago

Hi,

Has anyone tried to parallelize the calls to im_detect? I am facing issues with Caffe initialization when I make calls to the network. Exact error details are below.

Background: I have pre-loaded the network and am passing the net parameter in pooled calls.

Error Statement:

F0428 16:30:55.227715 19959 syncedmem.hpp:18] Check failed: error == cudaSuccess (3 vs. 0) initialization error

leejiajun commented 8 years ago

@Anubhav7 Out of memory, I guess?

Anubhav7 commented 8 years ago

@leejiajun No, it's not one of those times :)

anas-899 commented 8 years ago

Hi @deboc, thank you for the very helpful training instructions in: https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md I finally trained on the INRIA dataset successfully, but how do I run the test? I tried to use tools/test_net.py, but it needs a test.prototxt file.

sahuvaibhav commented 8 years ago

Hi, how did you do the refactoring for the test set? It is not clear how to create symlinks for the test set. Please help.

BenjaminTT commented 8 years ago

Hi all, I am not sure I am doing this the right way, but this seems like the right place to ask some questions.

I am trying to get used to training py-faster-rcnn, and I have trained it successfully on the B3DO dataset (though the result is only around 0.3 mAP). My goal would be to perform object recognition using a Kinect on RGB and Depth separately. However, I have not come across articles dealing with object recognition using Depth only, except for some {bag of features (HOG/HDD) and classification} approaches. Are there any limitations I don't see that prevent using Faster R-CNN on depth data only?

Also, I tried to train Faster R-CNN without using the models pretrained on ImageNet: first on B3DO, but without success (I assumed it was because of the size of the dataset), and then on PASCAL VOC (both end-to-end training and alt opt). During training the loss does not seem to converge, and when I try to run test_net.py I get an error, whereas I could run it when I used the pretrained model:

File "/home/benjamin/FastRCNN/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 148, in voc_eval
    BB = BB[sorted_ind, :]
IndexError: too many indices for array

So if I would like not to use the pretrained model, are there any steps besides leaving --weights out of the training options?

PS: I am using the ZF model. If I am breaking any rules that I am unaware of, please tell me. Any comment would be useful, thanks!

tharuniitk commented 8 years ago

Hi @deboc, Looking at the link below I followed the steps you have given for training faster RCNN using alternate optimization scheme on custom dataset. https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md

I have 37 classes of data including background. I have changed the num_outputs accordingly. But I receive the following error again and again and can't find the mistake. I'll be extremely grateful if somebody can help me address this issue. im_proposals are calculated for all the training images and then an error is showing up as below in the attached image.

[image: fasterrcnn error]

deboc commented 8 years ago

Hi, first, are you sure you have removed the cache? $ cd $ rm data/cache/voc_2016_train_gt_roidb.pkl $ rm output/faster_rcnn_alt_opt/voc_2016train/vgg*
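
If you prefer to clear the cached roidb files from Python, a small sketch follows. It assumes the cache files live under data/cache and end in `_roidb.pkl`, as in the command above; the function name `clear_roidb_cache` is hypothetical.

```python
import glob
import os


def clear_roidb_cache(cache_dir="data/cache"):
    """Delete cached roidb pickles and return the list of removed paths."""
    removed = []
    for path in sorted(glob.glob(os.path.join(cache_dir, "*_roidb.pkl"))):
        os.remove(path)
        removed.append(path)
    return removed
```

Stale cache files matter because the roidb is rebuilt only when no cached copy exists, so changes to annotations or class lists are otherwise ignored.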

tharuniitk commented 8 years ago

Hi @deboc, I have cleared the cache, and I have now trained the model successfully. But I am hitting an error while testing, the same as @BenjaminTT above. I am attaching it below. Grateful for your help.

Traceback (most recent call last):
  File "./tools/test_net.py", line 105, in
    test_net(net, imdb, max_per_image=args.max_per_image, vis=args.vis)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/fast_rcnn/test.py", line 295, in test_net
    imdb.evaluate_detections(all_boxes, output_dir)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 341, in evaluate_detections
    self._do_python_eval(output_dir)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 304, in _do_python_eval
    use_07_metric=use_07_metric)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 157, in voc_eval
    BB = BB[sorted_ind, :]
IndexError: too many indices for array

bityangke commented 8 years ago

I also encountered the same problem as @tharuniitk

tohnperfect commented 8 years ago

Hi @deboc,

I had followed your guidance but I got the following error,

  File "./tools/train_faster_rcnn_alt_opt.py", line 210, in <module>
    cfg_from_file(args.cfg_file)
  File "/home/caffe/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 263, in cfg_from_file
    _merge_a_into_b(yaml_cfg, __C)
  File "/home/caffe/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 235, in _merge_a_into_b
    raise KeyError('{} is not a valid config key'.format(k))
KeyError: 'MODEL_DIR is not a valid config key'

I have no idea what I did wrong. Could you please give me some advice?

deboc commented 8 years ago

Hi @tohnperfect, cfg.MODEL_DIR was added in February; maybe you need to update your repo?

tohnperfect commented 8 years ago

@deboc I just downloaded this git last week.

deboc commented 8 years ago

It's MODELS_DIR.

deboc commented 8 years ago

@tharuniitk, @bityangke: I think you're using the test step of the VOC dataset, but you need to write your own for the new dataset. For my tutorial on the inria-person dataset the test step was not implemented; I have added it now. You can refer to this commit
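
A quick check that may help narrow this down: "too many indices for array" is what NumPy raises when a two-dimensional index is applied to a one-dimensional array, which is what the box array becomes when a class has no detections written at all. That this is the cause in your case is an assumption. The sketch below lists detection result files that are empty; the directory layout and the function name `empty_detection_files` are hypothetical.

```python
import glob
import os


def empty_detection_files(results_dir, pattern="*.txt"):
    """Return result files under results_dir that contain no detection lines."""
    empty = []
    for path in sorted(glob.glob(os.path.join(results_dir, pattern))):
        with open(path) as handle:
            # A file counts as empty when it holds only whitespace.
            if not handle.read().strip():
                empty.append(path)
    return empty
```

If every class file is empty, the network is producing no detections above threshold, which points back at training (for example, a loss that never converged) rather than at the evaluation code.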

tohnperfect commented 8 years ago

@deboc I've checked the code and it is the latest version.

deboc commented 8 years ago

@tohnperfect Yes, sorry. Like I said, the parameter you are looking for is in fact "MODELS_DIR", not "MODEL_DIR". I think it's just that :)
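
Typos like this are easy to catch before launching a long run. The sketch below compares the keys of a user config against the known defaults and suggests the closest valid name. The defaults dict is a trimmed, illustrative subset and the helper name `check_config_keys` is hypothetical; it is not part of the repository.

```python
import difflib


def check_config_keys(user_cfg, default_cfg):
    """Return {bad_key: closest_valid_key_or_None} for keys missing from default_cfg."""
    problems = {}
    for key in user_cfg:
        if key not in default_cfg:
            matches = difflib.get_close_matches(key, list(default_cfg), n=1)
            problems[key] = matches[0] if matches else None
    return problems


# Illustrative subset of default keys; values are placeholders.
defaults = {"MODELS_DIR": "models", "DATA_DIR": "data", "EXP_DIR": "default"}
print(check_config_keys({"MODEL_DIR": "models/inria"}, defaults))
```

The config loader rejects any key that is not already present in the defaults, which is why a single missing letter stops training at startup.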

tohnperfect commented 8 years ago

I see. Thank you very much @deboc.

tohnperfect commented 8 years ago

@deboc Are you planning to write a tutorial about testing on new datasets as well? I guess it would be a really useful tutorial.

BenjaminTT commented 8 years ago

Hi @deboc, if the dataset has the same format as PASCAL VOC, the implemented test works. I trained on the B3DO dataset (with RGB only) and it worked (I obtained a mAP of ~0.32). However, problems appear when I am not using the pretrained model. In that case, whether I train on B3DO or PASCAL VOC, the test fails, and even during training the loss does not seem to converge. Is there any specific step needed in order to avoid using the pretrained models?