ThorPham opened this issue 6 years ago
1. Download the data from https://www.kaggle.com/c/imagenet-object-detection-challenge/data
2. Unzip it to your path.
3. Modify the code: change ILSVRC_dataset_path to your path.
I have just finished downloading ImageNet. But in your code I don't understand the function input_generator(). Can you explain the output of this function? I tried running the code, but nothing happens. Thank you so much. Hope you can help me.
The code was updated to a newer version which may not work; I have changed it back, you can try it now.
input_generator() lists the file names in /Annotations/DET/train/*.txt; based on those names it finds the matching annotation file and JPEG file, then parses the annotation, loads the image, and prepares a batch.
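For orientation, a rough sketch of that flow. parse_label() and produce_batch() are the repo's own helpers (their exact signatures here are assumptions), and the directory layout is inferred from this thread, not copied from the repo:

```python
import os
import glob
import numpy as np

def input_generator(dataset_path):
    # Index files (train_1.txt, train_2.txt, ...) listing image names;
    # location assumed from the discussion further down this thread.
    list_files = glob.glob(os.path.join(dataset_path, 'ImageSets/DET/train_*.txt'))
    while True:
        for list_file in list_files:
            for line in open(list_file):
                name = line.split()[0]  # e.g. ILSVRC2014_train_0003/ILSVRC2014_train_00037587
                try:
                    xml = os.path.join(dataset_path, 'Annotations/DET/train', name + '.xml')
                    jpg = os.path.join(dataset_path, 'Data/DET/train', name + '.JPEG')
                    gt_boxes, scale = parse_label(xml)                 # repo helper, assumed signature
                    tiles, scores, deltas = produce_batch(jpg, gt_boxes, scale)
                except Exception:
                    # a bad image is skipped so training can continue
                    print('parse label or produce batch failed: for:', name)
                    continue
                yield np.asarray(tiles), [np.asarray(scores), np.asarray(deltas)]
```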
Woah!! Now it's working. But I had to edit the parse_label() function a little: I changed its output from (height, width) to (w_scale, h_scale), because without this change the code raises an "image mode wrong" error in produce_batch().
Ah yes, you are right, I didn't revert utils.py.
Thank you so much. When I train, a traceback appears. What is happening?
```
Epoch 1/800
   6/1000 [..............................] - ETA: 11:45:00 - loss: 991.5018 - scores1_loss: 5.7656 - deltas1_loss: 985.7362
parse label or produce batch failed: for: ILSVRC2014_train_0003/ILSVRC2014_train_00037587
Traceback (most recent call last):
  File "
```
Can you explain the function unmap? I don't understand what it does or how it works.
```python
full_labels = unmap(labels, total_anchors, inds_inside, fill=-1)
```
We only select some of the labels to train on. In this line, unmap maps the subset of labels (e.g. length = 128) back to the full set (length = number of anchors), keeping the selected 0 and 1 labels; all the others are set to -1.
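For reference, the common Faster R-CNN implementation of this helper looks roughly like the following (the repo's version may differ in details):

```python
import numpy as np

def unmap(data, count, inds, fill=0):
    """Scatter a subset of items (data) back into an array of size
    `count`; positions not listed in `inds` are set to `fill`."""
    if len(data.shape) == 1:
        ret = np.empty((count,), dtype=np.float32)
        ret.fill(fill)
        ret[inds] = data
    else:
        ret = np.empty((count,) + data.shape[1:], dtype=np.float32)
        ret.fill(fill)
        ret[inds, :] = data
    return ret

# e.g. 128 sampled labels scattered back over all anchors, rest set to -1:
# full_labels = unmap(labels, total_anchors, inds_inside, fill=-1)
```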
Thank you so much. When I train, a lot of tracebacks appear. Is the model training correctly? What is happening? https://photos.google.com/search/_tra_/photo/AF1QipMOGEVShy2h-sQvYIkxbqK6uo0vShnFPP0yL9k
You can ignore those. I can't see the picture, but it is probably a data-processing problem: if an error happens, that image is skipped and the model continues. There are many images, so I haven't fixed that problem yet.
I have been training for 8 hours, but every image gives the warning: parse label or produce batch failed: for: ILSVRC2014_train_0003/ILSVRC2014_train_00037911 "ValueError: attempt to get argmax of an empty sequence". How can I fix it? https://drive.google.com/file/d/1Bm8f_DsG8s_XDcESpxanIUdA2Ik1f_oo/view?usp=sharing
This code is used for the explanation in the blog, so it is really slow. I use another branch to train; you may try that:

```bash
git checkout multithread
python gen_fc_map.py  # generate feature maps and store them to disk; this will take 30 hrs...
python RPN.py         # use the feature maps directly; this is super fast
```

Remember to replace the file paths. This branch first generates the feature maps and stores them to disk, and then uses the feature maps directly in the RPN. If you don't want to wait too long, you can use a partial dataset: just run gen_fc_map.py for a few hours and kill it; then the RPN will train only on the feature maps that were generated.
Thank you. I want to know how to train the RPN; I don't care about accuracy or time. I have another question: in the function produce_batch(), why is batch_inds = (batch_inds / k).astype(np.int)? batch_inds is an index into the labels, so why do you divide by k (k is the number of anchors)?
It gets the selected anchor's "pixel" in the feature map; check my post. With the down-sampled anchor samples, we need to calculate the feature-map point position for each one and use that position to form our mini-batch. For example, an anchor sample with index 150, divided by 9, gives the integer 16; this 16 represents the point (1, 2), the second-row, third-column point in the feature map.
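In code, the arithmetic from that example looks like this (k = 9 anchors per point and a 14-column feature map, the numbers used in this thread):

```python
k, fm_width = 9, 14
anchor_index = 150
point_index = anchor_index // k           # 150 // 9 = 16
row, col = divmod(point_index, fm_width)  # 16 -> (1, 2)
print(point_index, row, col)              # the second-row, third-column point
```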
Thank you so much! How do I change lambda in the loss function? I want to change the weights of the classifier and regression terms in the loss function.
Check loss_weights in Keras.
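A minimal sketch of what that looks like. The output names follow the training log above (scores1 / deltas1); the loss functions and weight values here are placeholders, not the repo's exact choices:

```python
# `model` is assumed to be the two-output RPN
# Model(inputs=[feature_map_tile], outputs=[output_scores, output_deltas])
model.compile(
    optimizer='adam',
    loss={'scores1': 'binary_crossentropy',
          'deltas1': 'mean_absolute_error'},
    loss_weights={'scores1': 1.0, 'deltas1': 10.0},  # the lambda trade-off
)
```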
Thank you so much. Can you add me as a friend? Do you have Skype or Facebook?
You can add "Jiankang Dong" on Facebook, but I rarely use it... you know, in China Facebook is not popular. Anyway, you can contact me by mail (at the bottom of the blog).
BTW, how did you find my blog? I tried to google it, but no results link to my page.
I searched through a lot of Faster R-CNN projects on GitHub, and that's how I saw your blog. I appreciate that you have helped me a lot; sorry for asking so many questions. In your post: feature_map_tile = Input(shape=(None,None,1536)). What is the input? Is it the feature map for each anchor (a 3x3 sliding window on the feature map)? And what are the shapes of the Input and of convolution_3x3? Following your example, if the image size is (333, 500, 3), then the input shape is (3, 3, 1536). Is that right?
image size (333, 500, 3) --> feature map size (9, 14, 1536) --> slice this into many 3x3 tiles --> (3, 3, 1536) --> feed to the input
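A numpy sketch of that slicing step (the border handling here, one point of zero padding, is an assumption; the repo may treat edges differently):

```python
import numpy as np

fm = np.zeros((9, 14, 1536), dtype=np.float32)   # placeholder feature map
padded = np.pad(fm, ((1, 1), (1, 1), (0, 0)), mode='constant')
tiles = np.stack([padded[r:r + 3, c:c + 3, :]    # 3x3 tile centred on (r, c)
                  for r in range(fm.shape[0])
                  for c in range(fm.shape[1])])
print(tiles.shape)  # (126, 3, 3, 1536): one tile per feature-map point
```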
Can you explain: model = Model(inputs=[feature_map_tile], outputs=[output_scores, output_deltas])? feature_map_tile has shape (9, 14, 1536), but when training with fit_generator(input_generator()), the first element yielded by input_generator() is batch_tiles, which has shape (3, 3, 1536). Why are the sizes different?
(9, 14, 1536) is sliced into many (3, 3, 1536) tiles.
That's right. But then how does training work? I think that when training, the input size and what we pass to Keras' fit method must match; that would mean batch_tiles in input_generator() must have shape (9, 14, 1536). Sorry, I'm a newbie in deep learning.
The input tensor shape is (3, 3, 1536), which matches the tiles. And think about it: different images generate different feature-map sizes, not always (9, 14, 1536), which would be impossible to train on directly; that's why we need processing to unify the shape.
Thank you. But what is the output shape after convolution_3x3 = Conv2D(filters=512, kernel_size=(3, 3), name="3x3")?
The spatial size is still 3x3; check the "SAME padding" concept. (With 512 filters the channel dimension becomes 512, so the full output shape is (3, 3, 512).)
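A quick shape check, assuming padding='same' (which is what "SAME padding" implies; the padding argument is not shown in the snippet above):

```python
from keras.layers import Input, Conv2D
from keras.models import Model

inp = Input(shape=(3, 3, 1536))
out = Conv2D(filters=512, kernel_size=(3, 3), padding='same', name='3x3')(inp)
print(Model(inp, out).output_shape)  # (None, 3, 3, 512): spatial dims kept
```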
I still don't understand why the input shape is (9, 14, 1536) but training uses shape (3, 3, 1536).
The input shape is (3, 3, 1536).
I have just read part 2 of your post. I see that when you predict, res = rpn_model.predict(feature_map), but the feature map has shape (14, 9, 1536), different from the input shape at training time, (3, 3, 1536). Is that right?
Aha, that is the key point. At training time the input is (3, 3, 1536), and we are using Conv2D, where all parameters are shared. So at prediction time the bigger input is "convolutionally slid over" using those same parameters, producing the larger (14, 9, ...) output; it is designed to do that. In other words, Conv2D can accept inputs of non-fixed size.
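A small demonstration of that property (shapes as in this thread; the model here is a stand-in, not the repo's RPN):

```python
import numpy as np
from keras.layers import Input, Conv2D
from keras.models import Model

# shape=(None, None, 1536) means "any spatial size"; the 3x3 kernel's
# weights are shared, so one model handles both inputs below.
x = Input(shape=(None, None, 1536))
m = Model(x, Conv2D(512, (3, 3), padding='same')(x))
print(m.predict(np.zeros((1, 3, 3, 1536))).shape)   # (1, 3, 3, 512)
print(m.predict(np.zeros((1, 14, 9, 1536))).shape)  # (1, 14, 9, 512)
```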
Thank you so much!
Hi ThorPham / dongjk, can you tell me exactly what you modified in parse_label() at the beginning to start training? I also experienced this when running RPN.py: it just stops at epoch 1/800 (like nothing happens). Thanks.
@noelcodes Being stuck at 1/800 means the model is waiting for data, and the data is blocked in the generator; the problem is probably in the dataset path. After the dataset is unzipped, it should contain:
ILSVRC/Annotations/DET/train (this folder contains all the annotations; the generator will parse them to get the GT boxes)
Maybe you can check that first.
@noelcodes Are you Vietnamese? I train only on CPU. When I tried to train, I lowered steps_per_epoch=1000, epochs=800 down to 50 and 20.
ImageNet did not approve my download request; that's why I am using my own dataset. I just want to clarify the format: should ILSVRC/ImageSets/DET/ contain categorynames.txt like the attached, and should the contents (please open the attached files) look like that too? If I'm wrong, please attach a sample here so that I have a reference to recreate my own version. Thanks. I'm not Vietnamese.
The files are not named categorynames.txt; they are named train_1.txt, train_2.txt, etc.
Hi dongjk, thank you for the wonderful blog; it has helped me a lot to actually understand the implementation. I can see from your code that you are performing alternating training: first training the RPN and then fine-tuning the detector module. Since there are no pre-trained weights, training the RPN takes a huge amount of time. I started training at 5:00 PM, and by 11:00 AM the next morning the first epoch still had not started. I am using a GeForce GTX 1080 GPU and PASCAL VOC2012. Can you suggest something? Thank you.
Can you show me step by step how to train the RPN model? I run the file RPN.py, but nothing happens. Thank you so much.