How to train the RPN model?

ThorPham commented 6 years ago

Can you show me, step by step, how to train the RPN model? I ran RPN.py but nothing happens. Thank you so much.

dongjk commented 6 years ago

1. Download the data: https://www.kaggle.com/c/imagenet-object-detection-challenge/data
2. Unzip it to your path.
3. Modify the code: change ILSVRC_dataset_path to your path.

ThorPham commented 6 years ago

I have just finished downloading ImageNet. But I don't understand the function input_generator() in your code. Can you explain what it outputs? I tried running the code but nothing happens. Thank you so much. Hope you can help me.

dongjk commented 6 years ago

The code had been updated to a newer version that may not work. I have changed it back, so you can try it now.

dongjk commented 6 years ago

input_generator() lists the file names in /Annotations/DET/train/*.txt. Based on those names it finds the matching annotation file and JPEG file, then parses the annotation, loads the image, and prepares a batch.

ThorPham commented 6 years ago

Wow!! Now it's working. But I had to edit the function parse_label() a little: I changed its output from (height, width) to (w_scale, h_scale). Without this change, running the code fails with an error: image mode wrong in function produce_batch.

dongjk commented 6 years ago

Ah yes, you are right, I didn't revert utils.py.

ThorPham commented 6 years ago

Thank you so much. When I train, a traceback appears. What is happening?

Epoch 1/800
6/1000 [..............................] - ETA: 11:45:00 - loss: 991.5018 - scores1_loss: 5.7656 - deltas1_loss: 985.7362
parse label or produce batch failed: for: ILSVRC2014_train_0003/ILSVRC2014_train_00037587
Traceback (most recent call last):
  File "", line 15, in input_generator
    tiles, labels, bboxes = produce_batch(img_path+line.split()[0]+'.JPEG', gt_boxes, scale)
  File "", line 52, in produce_batch
    gt_argmax_overlaps = overlaps.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence
10/1000 [..............................] - ETA: 10:32:14 - loss: 872.1376 - scores1_loss: 5.3085 - deltas1_loss: 866.8291

ThorPham commented 6 years ago

Can you explain the function unmap? I don't understand what it does or how it works.

dongjk commented 6 years ago

full_labels = unmap(labels, total_anchors, inds_inside, fill=-1)

We only select some of the labels to train on. In this line, unmap maps a subset of labels (e.g. length=128) back to the full set (length = number of anchors), keeping the selected labels as 0 or 1 and setting all the others to -1.
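A minimal sketch of what such an unmap could look like, assuming 1-D label arrays only. The body below is an illustration written for this thread, not the repository's actual implementation, and the toy sizes are made up.

```python
import numpy as np

def unmap(data, count, inds, fill=0):
    """Scatter a 1-D subset `data` back into a full array of length `count`.

    Positions listed in `inds` receive the values from `data`;
    every other position is set to `fill`.
    """
    full = np.full((count,), fill, dtype=np.float32)
    full[inds] = data
    return full

# Toy example (assumed sizes): 10 anchors in total, 3 of them kept.
total_anchors = 10
inds_inside = np.array([2, 3, 7])
labels = np.array([1, 0, -1])  # 1 = object, 0 = background, -1 = ignored

full_labels = unmap(labels, total_anchors, inds_inside, fill=-1)
```

Every anchor that was not selected ends up with -1, which is the "ignore" value described above.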

ThorPham commented 6 years ago

Thank you so much. When I train, a lot of tracebacks appear. Is the model training correctly? What is happening? https://photos.google.com/search/_tra_/photo/AF1QipMOGEVShy2h-sQvYIkxbqK6uo0vShnFPP0yL9k

dongjk commented 6 years ago

You can ignore those. I can't see the picture, but it may be a data processing problem. If an error happens, that image is skipped and the model continues. There are many images, so I haven't fixed that problem yet.
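A small sketch of the skip-and-continue behaviour described here. It assumes that overlaps is a (num_anchors, num_gt_boxes) array, in which case argmax over axis 0 raises exactly this ValueError when there are zero rows. The function names are hypothetical and the generator is deliberately simplified.

```python
import numpy as np

def best_anchor_per_gt(overlaps):
    """Return, for each ground-truth box, the index of the best anchor.

    `overlaps` is assumed to be (num_anchors, num_gt_boxes).
    NumPy raises ValueError if the reduced axis has length zero.
    """
    return overlaps.argmax(axis=0)

def batch_generator(samples, produce):
    """Yield produce(sample) for each sample, skipping the ones that fail."""
    for name, sample in samples:
        try:
            yield produce(sample)
        except ValueError as err:
            # Log the failure and move on to the next image.
            print("parse label or produce batch failed: for:", name, err)

good = np.array([[0.1, 0.7],
                 [0.6, 0.2]])      # 2 anchors, 2 gt boxes
empty = np.zeros((0, 2))           # no anchors left for this image

samples = [("img_a", good), ("img_b", empty), ("img_c", good)]
results = list(batch_generator(samples, best_anchor_per_gt))
```

If every image fails this way, no batch is ever produced, so it may be worth checking why the array is empty for a single image before starting a long run.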

ThorPham commented 6 years ago

I trained for 8 hours, but every image gives the warning parse label or produce batch failed: for: ILSVRC2014_train_0003/ILSVRC2014_train_00037911 "ValueError: attempt to get argmax of an empty sequence". How can I fix it? https://drive.google.com/file/d/1Bm8f_DsG8s_XDcESpxanIUdA2Ik1f_oo/view?usp=sharing

dongjk commented 6 years ago

This code is the one used for the explanation in the blog, so it is really slow. I use another branch to train; you may try that.

git checkout multithread
python gen_fc_map.py  #generate feature map and store to disk, this will take 30 hrs....
python RPN.py  # use feature map directly, this is super fast.

Remember to replace the file paths. This branch first generates the feature maps and stores them to disk, and then uses the feature maps directly in RPN.

If you don't want to wait too long, you can use a partial dataset: just run gen_fc_map.py for a few hours and kill it, then RPN will train only on the feature maps that were already generated.
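The idea of caching feature maps and then training on whatever has been generated so far can be sketched as follows. This is only an illustration: the .npy format, the file naming, and both helper functions are assumptions, not what gen_fc_map.py actually does.

```python
import glob
import os
import tempfile

import numpy as np

def save_feature_map(cache_dir, image_id, feature_map):
    """Store one feature map on disk as <image_id>.npy (assumed layout)."""
    os.makedirs(cache_dir, exist_ok=True)
    np.save(os.path.join(cache_dir, image_id + ".npy"), feature_map)

def iter_cached_feature_maps(cache_dir):
    """Yield (image_id, feature_map) for every map already on disk."""
    for path in sorted(glob.glob(os.path.join(cache_dir, "*.npy"))):
        image_id = os.path.splitext(os.path.basename(path))[0]
        yield image_id, np.load(path)

# Simulate an interrupted run: only one feature map was written.
cache_dir = tempfile.mkdtemp()
save_feature_map(cache_dir, "img_a", np.zeros((9, 14, 1536), dtype=np.float32))

cached = dict(iter_cached_feature_maps(cache_dir))
```

Because the training side only iterates over files that exist, stopping the generation step early simply yields a smaller training set.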

ThorPham commented 6 years ago

Thank you. I want to know how to train the RPN and don't care about accuracy or time. I have another question: in the function produce_batch(), why batch_inds=(batch_inds / k).astype(np.int)? batch_inds is an index into the labels, so why do you divide by k (k is the number of anchors)?

dongjk commented 6 years ago

To get the selected anchor's "pixel" in the feature map. Check my post:

With the down-sampled anchors, we now need to calculate the feature map point position for each anchor sample, and use that position to form our mini-batch. For example, an anchor sample has index 150; divided by 9, the integer part is 16, and this 16 represents the point (1,2), the second row, third column in the feature map.
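The arithmetic in this example can be checked with a few lines of NumPy. The sketch assumes k = 9 anchors per point and the (9, 14) feature map used elsewhere in this thread. Note that np.int was removed in NumPy 1.24, so integer division is used here instead of (batch_inds / k).astype(np.int).

```python
import numpy as np

k = 9                        # anchors per feature map point
feature_map_shape = (9, 14)  # rows, columns of the feature map

# Single anchor: index 150 belongs to feature map point 16.
anchor_index = 150
point_index = anchor_index // k
row, col = np.unravel_index(point_index, feature_map_shape)

# The same conversion for a whole batch of anchor indices.
batch_inds = np.array([0, 150, 1133])
batch_points = batch_inds // k
batch_rows, batch_cols = np.unravel_index(batch_points, feature_map_shape)
```

Here row and col come out as 1 and 2, matching the "second row, third column" point in the explanation.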

ThorPham commented 6 years ago

Thank you so much! How do I change lambda in the loss function? I want to change the weights of the classifier and regression terms in the loss function.

dongjk commented 6 years ago

Check loss_weights in Keras.
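In Keras, Model.compile accepts a loss_weights argument (a list, or a dict keyed by output name), and the total loss is the weighted sum of the per-output losses. The sketch below only reproduces that arithmetic in plain Python. The output names scores1 and deltas1 are taken from the training log earlier in this thread, and the 0.1 weight is an arbitrary illustration, not a recommended value.

```python
def weighted_total_loss(losses, loss_weights):
    """Weighted sum of per-output losses; missing weights default to 1.0."""
    return sum(loss_weights.get(name, 1.0) * value
               for name, value in losses.items())

# Per-output losses from the first log line in this thread.
losses = {"scores1": 5.7656, "deltas1": 985.7362}

# Equal weights reproduce the logged total loss (991.5018).
equal = weighted_total_loss(losses, {"scores1": 1.0, "deltas1": 1.0})

# Down-weighting the regression term changes the balance between the two.
rebalanced = weighted_total_loss(losses, {"scores1": 1.0, "deltas1": 0.1})
```

The same dict, for example {"scores1": 1.0, "deltas1": 0.1}, is what would be passed as loss_weights when compiling the model.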

ThorPham commented 6 years ago

Thank you so much. Can you add me as a friend? Do you have Skype or Facebook?

dongjk commented 6 years ago

You can add "Jiankang Dong" on Facebook, but I rarely use it... you know, Facebook is not popular in China. Anyway, you can contact me by mail (at the bottom of the blog).

BTW, how did you find my blog? I tried to google it, but no result links to my page.

ThorPham commented 6 years ago

I searched through a lot of Faster RCNN projects on GitHub, and I saw your blog. I appreciate that you have helped me a lot. Sorry for asking you so many questions. In your post: feature_map_tile = Input(shape=(None,None,1536)). What is the input? Is it the feature map for each anchor (a 3x3 sliding window on the feature map)? And what are the shapes of Input and convolution_3x3? As in your example, if the image size is (333, 500, 3), then the input shape is (3,3,1536). Is that right?

dongjk commented 6 years ago

image size : (333, 500, 3) --> feature map size (9, 14, 1536) ---> slice this to many 3x3 tiles --> (3,3,1536) ----> feed to input
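The slicing step can be sketched with NumPy as follows. This assumes zero padding at the borders so that every feature map point gets a 3x3 tile. The helper name is hypothetical, and the repository may slice only the sampled positions rather than all of them.

```python
import numpy as np

def extract_tiles(feature_map, size=3):
    """Cut one size x size tile around every point of an (H, W, C) feature map."""
    pad = size // 2
    # Zero-pad height and width so border points also get a full tile.
    padded = np.pad(feature_map, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    height, width, _ = feature_map.shape
    tiles = [padded[r:r + size, c:c + size, :]
             for r in range(height)
             for c in range(width)]
    return np.stack(tiles)

feature_map = np.zeros((9, 14, 1536), dtype=np.float32)
tiles = extract_tiles(feature_map)
```

A (9, 14, 1536) feature map yields 9 * 14 = 126 tiles, each of shape (3, 3, 1536).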

ThorPham commented 6 years ago

Can you explain: model = Model(inputs=[feature_map_tile], outputs=[output_scores, output_deltas])? feature_map_tile has shape (9,14,1536), but when training with fit_generator(input_generator()), the first output of input_generator, batch_tiles, has shape (3,3,1536). Why are the sizes different?

dongjk commented 6 years ago

(9,14,1536) is sliced into many (3,3,1536) tiles.

ThorPham commented 6 years ago

That's right. But how does training work? I think that in Keras the model input and the data passed to fit must have the same size. That means batch_tiles in input_generator() must have shape (9,14,1536). Sorry, I'm a newbie in deep learning.

dongjk commented 6 years ago

The input tensor shape is (3,3,1536), matching the tiles. Think about it: different images generate different feature maps, not always (9,14,1536), and those can't be batched together for training, so a step is needed to unify the shape.

ThorPham commented 6 years ago

Thank you. But what is the output shape after convolution_3x3 = Conv2D( filters=512, kernel_size=(3, 3), name="3x3"

dongjk commented 6 years ago

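With SAME padding the spatial size stays 3x3, but the channel count becomes the number of filters, so the output is (3,3,512) rather than (3,3,1536). Check the "SAME padding" concept.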
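As a check on the shapes discussed here, the output shape of a stride-1 convolution can be computed by hand. The helper below is a hypothetical illustration, not a Keras function. It shows that the channel count always equals the number of filters, while the spatial size depends on the padding mode ("valid" is the Keras Conv2D default).

```python
def conv2d_output_shape(height, width, filters, kernel_size, padding="valid"):
    """Output shape (H, W, C) of a stride-1 2-D convolution."""
    kernel_h, kernel_w = kernel_size
    if padding == "same":
        # Input is padded so the spatial size is preserved.
        return (height, width, filters)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (height - kernel_h + 1, width - kernel_w + 1, filters)
    raise ValueError("padding must be 'same' or 'valid'")

tile_same = conv2d_output_shape(3, 3, filters=512, kernel_size=(3, 3), padding="same")
tile_valid = conv2d_output_shape(3, 3, filters=512, kernel_size=(3, 3), padding="valid")
full_same = conv2d_output_shape(9, 14, filters=512, kernel_size=(3, 3), padding="same")
```

For a (3,3,1536) tile this gives (3,3,512) with "same" padding and (1,1,512) with "valid" padding.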

ThorPham commented 6 years ago

I still don't understand why the input shape is (9,14,1536) but training uses shape (3,3,1536).

dongjk commented 6 years ago

input shape is (3,3,1536)

ThorPham commented 6 years ago

I have just read part 2 of your post. I see that when you predict, you call res=rpn_model.predict(feature_map). But the feature map has shape (14,9,1536), which is different from the training input shape (3,3,1536). Is that right?

dongjk commented 6 years ago

Aha, that is the key point. At training time the input is (3,3,1536), and because we are using conv2d, all parameters are shared. So at prediction time the same parameters "convolutionally slide" over the bigger (14,9,1536) input. It is designed to do that.

In other words, conv2d is able to accept input of non-fixed size.
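The weight-sharing argument can be verified with a naive NumPy convolution. This is a sketch under assumptions: small channel counts are used for speed, the weights are random, and zero padding is applied to the full map. It shows that running the kernel over the whole feature map gives, at each point, the same result as running it on the 3x3 tile cut around that point.

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive stride-1 'valid' convolution. x: (H, W, C), w: (kh, kw, C, F)."""
    kh, kw, _, filters = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w, filters))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kh, j:j + kw, :]
            # Sum over kernel height, width and input channels.
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3, 4, 2))   # one shared 3x3 kernel, 4 -> 2 channels
feature_map = rng.normal(size=(9, 14, 4))
padded = np.pad(feature_map, ((1, 1), (1, 1), (0, 0)))

# "Prediction": slide the kernel over the whole (padded) feature map.
full_out = conv2d_valid(padded, weights)

# "Training": apply the same kernel to the single tile centred on point (1, 2).
tile = padded[1:4, 2:5, :]
tile_out = conv2d_valid(tile, weights)
```

The tile output has shape (1, 1, 2) and equals full_out[1, 2], which is why weights trained on tiles can be applied to a feature map of any size.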

ThorPham commented 6 years ago

Thank you so so much

noelcodes commented 6 years ago

Hi ThorPham / dongjk, can you tell me what exactly you modified in parse_label() at the beginning to start training? I also ran rpn.py but it just stopped at epoch 1/800 (like nothing happens). Thanks.

dongjk commented 6 years ago

@noelcodes Being stuck at 1/800 means the model is waiting for data and the data is blocked in the generator. The problem is probably in the dataset path. After the dataset is unzipped, it should contain:

ILSVRC/ImageSets/DET (this folder contains 200 files, one per category, so the class label is parsed from the file name. Each file is in text format and lists all the file names in that category, which we can use for mini-batches)
ILSVRC/Data/DET/train (this folder contains all the JPEGs; their names can be found in the category files above)
ILSVRC/Annotations/DET/train (this folder contains all the annotations; the generator parses them to get the gt boxes)

Maybe you can check that first.
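A quick way to run that check is sketched below. It assumes the dataset root is the directory that contains the ILSVRC folder; the helper name is hypothetical and the temporary directory only stands in for a real dataset path.

```python
import os
import tempfile

# Folder layout listed above, relative to the dataset root (assumed).
EXPECTED_DIRS = [
    os.path.join("ILSVRC", "ImageSets", "DET"),
    os.path.join("ILSVRC", "Data", "DET", "train"),
    os.path.join("ILSVRC", "Annotations", "DET", "train"),
]

def missing_dataset_dirs(dataset_root):
    """Return the expected sub-folders that do not exist under dataset_root."""
    return [d for d in EXPECTED_DIRS
            if not os.path.isdir(os.path.join(dataset_root, d))]

# Demo with a temporary root where only one of the folders exists.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, EXPECTED_DIRS[0]))
missing = missing_dataset_dirs(root)
```

An empty result means all three folders were found. Anything else points at a path that the generator would silently fail to read.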

ThorPham commented 6 years ago

@noelcodes Are you Vietnamese? I train on CPU only. When I tried training, I reduced steps_per_epoch=1000, epochs=800 to 50 and 20.

noelcodes commented 6 years ago

Image-net did not approve my download request, that's why I am using my own dataset. Just want to clarify the format: ILSVRC/ImageSets/DET/ should contain categorynames.txt like attached, and inside (pls open attached files) should look like that too? If I'm wrong, please attach a sample here, so that I have reference to recreate my own version. Thanks, I'm not Vietnamese.

dress.txt jeans.txt

dongjk commented 6 years ago

The files are not named categorynames.txt; they have names like train_1.txt, train_2.txt, etc.

PallawiSinghal commented 5 years ago

Hi Dongjk, Thank you for the wonderful blog. It has helped me a lot to actually understand the implementation. I can see from your code that you are trying to perform alternating training, which is first training the RPN and then fine-tuning the detector module. As there are no pre-trained weights for this training RPN it is taking a huge amount of time. I had started the training at 5:00 Pm and next day morning at 11:00 Am I could not see the first epoch starting. I am using GPU GeForce GTX 1080. I am using PASCAL VOC2012. Can you suggest something?

Thank you.

dongjk / faster_rcnn_keras

How to training model RPN ? #2