nqanh / affordance-net

AffordanceNet - Multiclass Instance Segmentation Framework - ICRA 2018

How to train on my dataset #8

Closed lsj910128 closed 6 years ago

lsj910128 commented 6 years ago

Hi, I have some questions about the dataset. I downloaded the data folder and found four .txt files inside the ImageSets/Main folder: 4train.txt, test.txt, train.txt, and train_ORIGINAL_BACKUP (copy).txt. But in the Pascal VOC dataset, the ImageSets/Main folder contains trainval.txt, train.txt, val.txt, and test.txt. So how should I split my dataset?

Then, when I run the code, I hit this problem:

Traceback (most recent call last):
  File "./tools/train_net.py", line 116, in <module>
    max_iters=args.max_iters)
  File "/shenlab/lab_stor4/shujun/affordance-net-master/tools/../lib/fast_rcnn/train.py", line 171, in train_net
    model_paths = sw.train_model(max_iters)
  File "/shenlab/lab_stor4/shujun/affordance-net-master/tools/../lib/fast_rcnn/train.py", line 110, in train_model
    self.solver.step(1)
  File "/shenlab/lab_stor4/shujun/affordance-net-master/tools/../lib/rpn/proposal_target_layer.py", line 310, in forward
    gt_mask = mask_flipped_ims[gt_mask_ind]
IndexError: list index out of range

Have you met this problem before?

nqanh commented 6 years ago

You need train.txt for training and test.txt for testing. The other files are just dump files for debugging. train.txt contains both the training and the validation images.

The error suggests something is wrong with your groundtruth mask files. See $AffordanceNet_ROOT/utils for details on how to train on your own dataset.
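As a minimal sketch of producing those two split files (Python 3; the 80/20 split and the one-ID-per-line VOC layout are my assumptions, not a script from the repo):

```python
import os
import random
import tempfile

def write_splits(image_ids, out_dir, train_frac=0.8, seed=0):
    """Shuffle the IDs and write VOC-style train.txt / test.txt, one ID per line."""
    random.Random(seed).shuffle(image_ids)
    n_train = int(len(image_ids) * train_frac)
    os.makedirs(out_dir, exist_ok=True)  # e.g. .../ImageSets/Main
    for name, ids in (('train.txt', image_ids[:n_train]),
                      ('test.txt', image_ids[n_train:])):
        with open(os.path.join(out_dir, name), 'w') as f:
            f.write('\n'.join(ids) + '\n')

# demo with dummy image IDs
out = tempfile.mkdtemp()
write_splits(['195', '255', '300', '301', '302'], out)
print(sorted(os.listdir(out)))  # ['test.txt', 'train.txt']
```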

lsj910128 commented 6 years ago

Thank you for your reply. I made the groundtruth masks according to $AffordanceNet_ROOT/utils, and each object in an image has a mask. My dataset is a medical image dataset with 11 objects + 1 background and 11 mask (affordance) classes + 1 background. I can't find any problem with the groundtruth masks so far. But I noticed that in your dataset an image has at most 3 objects, while in mine an image can have up to 7 objects. Could this cause the problem listed in the last comment?

lsj910128 commented 6 years ago

Also, could you recommend software to open the .sm files? Thanks.

thanhtoando commented 6 years ago

@lsj910128: .sm files are binary files. You can load and view a groundtruth mask with the following Python code:

import cPickle
from PIL import Image

filename = 'any_file.sm'
with open(filename, 'rb') as f:
    seg_mask = cPickle.load(f)

# scale the label values up so the mask is visible when displayed
seg_mask = (seg_mask*255).astype('uint8')
img = Image.fromarray(seg_mask).convert('RGB')
print "image size: ", img.size
img.show()
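The snippet above is Python 2. For Python 3, a hedged sketch (assuming the .sm file is a pickled numpy array, as the comment above implies): cPickle becomes pickle, and data pickled under Python 2 usually needs encoding='latin1'. The demo below round-trips a dummy mask instead of a real .sm file.

```python
import os
import pickle
import tempfile

import numpy as np

def load_sm(path):
    """Load a .sm groundtruth mask (a pickled numpy array of class labels)."""
    with open(path, 'rb') as f:
        # masks written by Python 2's cPickle usually need latin1 decoding
        return pickle.load(f, encoding='latin1')

# round-trip demo with a dummy 2x2 mask standing in for a real .sm file
path = os.path.join(tempfile.gettempdir(), 'demo.sm')
dummy = np.array([[0, 1], [2, 3]], dtype=np.uint8)
with open(path, 'wb') as f:
    pickle.dump(dummy, f)

mask = load_sm(path)
print(mask.shape, mask.dtype)  # (2, 2) uint8
```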

nqanh commented 6 years ago

The number of objects in one image is not the problem. I think the error occurs because the code reads a class that doesn't exist in the dataset. Since your dataset has a different number of object categories and affordance classes, you'll need to change the prototxt file to make it work.

Also, take a look at dataset/pascal_voc.py, line 40. The object class names are defined there.
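For illustration only (the real tuple in pascal_voc.py holds the repo's own object names), the definition at that line is a tuple whose first entry is the background class, so with 11 medical object categories it would look something like:

```python
# hypothetical class tuple for an 11-object medical dataset; the names are
# placeholders, to be replaced with your own categories
classes = ('__background__',  # index 0 is always the background
           'organ_1', 'organ_2', 'organ_3', 'organ_4', 'organ_5', 'organ_6',
           'organ_7', 'organ_8', 'organ_9', 'organ_10', 'organ_11')

# the prototxt layers that predict class scores must output this many classes
num_classes = len(classes)
print(num_classes)  # 12
```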

lsj910128 commented 6 years ago

@thanhtoando Thanks for your reply.

lsj910128 commented 6 years ago

@nqanh Thanks a lot!

I have already changed train.prototxt and set the object classes to my own in dataset/pascal_voc.py. But the problem still occurs.

I found there is a top: 'flipped' in the input-data layer of train.prototxt. I'd like to know whether the value of 'flipped' is variable or fixed. When I train with your data, the value is flipped: 0.0; with my data, it is flipped: 1.0. I think that may be the problem. I haven't changed anything else in train.prototxt.

Also, I'd like to know how and where to set the value of 'flipped'. Is it fast_rcnn/config.py, line 70, "__C.TRAIN.USE_FLIPPED = True"?

nqanh commented 6 years ago

The "flipped" is controlled by the __C.TRAIN.USE_FLIPPED flag in the config.py file. You can set __C.TRAIN.USE_FLIPPED = False to disable it. If the problem is still there, can you post the full output from the terminal?

nqanh commented 6 years ago

Also, remember to delete the cache files (the .pkl files in the cache folder) whenever you change anything in the dataset IO. Otherwise, you'll load stale data.

lsj910128 commented 6 years ago

Hi, here is the full output from the terminal.

I changed flipped: True to flipped: False, but the problem is still there. Please see the log from training on my dataset: at the end, when loading image '255.bmp', seg_mask_inds.shape is [7, 2], which means the code thinks there are 7 objects in '255.bmp'. But there are actually only two objects in '255.bmp', and only 255_1_segmask.sm and 255_2_segmask.sm exist in the GTsegmask_VOC_2012_train folder.

Solving...
im_info:  [[  417.          1000.             1.14810562]]
seg_mask_inds_shape (1, 2)
seg_mask_inds:  [[ 195.    1.]]
flipped:  [[ 0.]]
im_ind: 195.0
im_ind_shape: ()
flipped:  0.0
flipped_shape:  ()
im_ind:  195
seg path: ./data/cache/GTsegmask_VOC_2012_train/195_1_segmask.sm
mask_ims_shape: 1
mask_flipped_ims_shape: 1
im_ind:  195
k:  0
gt_mask_ind:  0
gt mask ind:  0
im_info:  [[  498.          1000.             1.48148143]]
seg_mask_inds_shape (7, 2)
seg_mask_inds:  [[ 255.    1.]
 [ 255.    2.]
 [ 255.    3.]
 [ 255.    4.]
 [ 255.    5.]
 [ 255.    6.]
 [ 255.    7.]]
flipped:  [[ 0.]]
im_ind: 255.0
im_ind_shape: ()
flipped:  0.0
flipped_shape:  ()
im_ind:  255
seg path: ./data/cache/GTsegmask_VOC_2012_train/255_1_segmask.sm
mask_ims_shape: 1
mask_flipped_ims_shape: 1
im_ind:  255
seg path: ./data/cache/GTsegmask_VOC_2012_train/255_2_segmask.sm
mask_ims_shape: 2
mask_flipped_ims_shape: 2
im_ind:  255
k:  5
gt_mask_ind:  5
gt mask ind:  5
Traceback (most recent call last):
  File "./tools/train_net.py", line 116, in <module>
    max_iters=args.max_iters)
  File "/shenlab/lab_stor4/shujun/affordance-net-master14/tools/../lib/fast_rcnn/train.py", line 171, in train_net
    model_paths = sw.train_model(max_iters)
  File "/shenlab/lab_stor4/shujun/affordance-net-master14/tools/../lib/fast_rcnn/train.py", line 110, in train_model
    self.solver.step(1)
  File "/shenlab/lab_stor4/shujun/affordance-net-master14/tools/../lib/rpn/proposal_target_layer.py", line 329, in forward
    gt_mask = mask_ims[gt_mask_ind]
IndexError: list index out of range

nqanh commented 6 years ago

Thanks for the log! We're getting closer to the error.

The number of objects in one image comes from the .xml annotation file. How many objects do you have in the 255.xml file (in the Annotations folder)? If there are 7 objects in that file, then each object must have a .sm file, so you should prepare 7 mask files, one per object.

In general, each object in the .xml file must have a groundtruth mask file. The first object in the .xml file should have the "IMGID_1_segmask.sm" mask, the second object should have "IMGID_2_segmask.sm", and so on.
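That convention can be checked mechanically. A hedged sketch (Python 3; the file layout is taken from the comment above, not from a repo script) that lists the mask files an annotation expects but that are missing on disk:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def check_masks(xml_path, mask_dir):
    """Return the IMGID_k_segmask.sm files the annotation expects but that are missing."""
    img_id = os.path.splitext(os.path.basename(xml_path))[0]
    num_objs = len(ET.parse(xml_path).getroot().findall('object'))
    missing = []
    for k in range(1, num_objs + 1):  # one 1-indexed mask per <object>
        mask = '%s_%d_segmask.sm' % (img_id, k)
        if not os.path.exists(os.path.join(mask_dir, mask)):
            missing.append(mask)
    return missing

# demo: an annotation with 2 objects but only 1 mask file on disk
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, '255.xml'), 'w') as f:
    f.write('<annotation><object/><object/></annotation>')
open(os.path.join(tmp, '255_1_segmask.sm'), 'w').close()

missing = check_masks(os.path.join(tmp, '255.xml'), tmp)
print(missing)  # ['255_2_segmask.sm']
```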

lsj910128 commented 6 years ago

Thank you very much! There really was something wrong with my annotation files. Now I can train the model on my dataset.

lsj910128 commented 6 years ago

Hi, when I test the model, I hit the problem below:

  File "/shenlab/lab_stor4/shujun/affordance-net-master14/tools/../lib/fast_rcnn/test.py", line 425, in test_net
    instance_mask[y1:y2+1, x1:x2+1] = mask
TypeError: slice indices must be integers or None or have an __index__ method

Then I changed the code to instance_mask[int(y1):int(y2+1), int(x1):int(x2+1)] = mask.

I tested the model again, but hit another problem:

Traceback (most recent call last):
  File "./tools/test_net.py", line 90, in <module>
    test_net(net, imdb, max_per_image=args.max_per_image, vis=args.vis)
  File "/shenlab/lab_stor4/shujun/affordance-net-master14/tools/../lib/fast_rcnn/test.py", line 425, in test_net
    instance_mask[int(y1):int(y2+1), int(x1):int(x2+1)] = mask
ValueError: could not broadcast input array from shape (29,36,244) into shape (29,36)

But when I use python tools/demo_img.py to test the model, there is no problem.

nqanh commented 6 years ago

You should use python tools/demo_img.py to avoid other problems. The other files may contain only temporary functions that were used during development.

In general, during testing we forward the image through the net, get back all the boxes, masks, scores, etc., then choose a box, select its associated mask, and resize that mask to the box size.
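A hedged sketch of that step (the function and variable names are mine, not the repo's, and nearest-neighbour indexing stands in for cv2.resize): the (h, w, num_classes) mask output must be reduced to a single class channel before it can be pasted into the 2-D instance_mask, which is exactly what the broadcast error above is complaining about.

```python
import numpy as np

def paste_mask(instance_mask, mask_logits, cls, box, thresh=0.5):
    """Pick the predicted class's channel from the (h, w, num_classes) mask
    output, resize it to the box, threshold it, and paste it into the
    image-sized 2-D instance_mask."""
    x1, y1, x2, y2 = [int(v) for v in box]  # slice indices must be integers
    bh, bw = y2 - y1 + 1, x2 - x1 + 1
    channel = mask_logits[:, :, cls]        # 2-D now, so the paste broadcasts
    ys = np.arange(bh) * channel.shape[0] // bh   # nearest-neighbour rows
    xs = np.arange(bw) * channel.shape[1] // bw   # nearest-neighbour cols
    resized = channel[ys][:, xs]
    instance_mask[y1:y2 + 1, x1:x2 + 1] = (resized > thresh).astype(np.uint8)
    return instance_mask

# demo with made-up sizes: a 29x36 mask output with 12 class channels
canvas = np.zeros((100, 100), dtype=np.uint8)
logits = np.random.rand(29, 36, 12)
out = paste_mask(canvas, logits, cls=3, box=(10.0, 20.0, 45.0, 48.0))
print(out.shape)  # (100, 100)
```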

lsj910128 commented 6 years ago

Thanks!

Can I train and test the model on Ubuntu 16.04?

nqanh commented 6 years ago

Sure, you can. I don't think we have any problems on Ubuntu 16.

lsj910128 commented 6 years ago

Hi, there is still a problem when testing the model using python tools/demo_img.py. Below is the log:

image folder: /shenlab/lab_stor4/shujun/affordance-net-master14/tools/img
list_test_img: ['4656.bmp', '4932.bmp', '4752.bmp', '4680.bmp', '4788.bmp', '4660.bmp', '4556.bmp', '4828.bmp', '4588.bmp', '4684.bmp', '4908.bmp', '4732.bmp', '4648.bmp', '4940.bmp', '4884.bmp', '4836.bmp', '4612.bmp', '4540.bmp', '4872.bmp', '4768.bmp', '4536.bmp', '4544.bmp', '4856.bmp', '4692.bmp', '4716.bmp', '4564.bmp', '4636.bmp', '4608.bmp', '4968.bmp', '4956.bmp', '4820.bmp', '4912.bmp', '4696.bmp', '4948.bmp', '4748.bmp', '4784.bmp', '4832.bmp', '4528.bmp', '4880.bmp', '4600.bmp', '4708.bmp', '4584.bmp', '4572.bmp', '4864.bmp', '4780.bmp', '4928.bmp', '4852.bmp', '4728.bmp', '4604.bmp', '4676.bmp', '4744.bmp', '4824.bmp', '4704.bmp', '4576.bmp', '4736.bmp', '4936.bmp', '4652.bmp', '4800.bmp', '4672.bmp', '4560.bmp', '4848.bmp', '4944.bmp', '4596.bmp', '4720.bmp', '4792.bmp', '4808.bmp', '4804.bmp', '4816.bmp', '4712.bmp', '4812.bmp', '4972.bmp', '4644.bmp', '4568.bmp', '4532.bmp', '4760.bmp', '4620.bmp', '4952.bmp', '4900.bmp', '4772.bmp', '4524.bmp', '4552.bmp', '4632.bmp', '4640.bmp', '4920.bmp', '4904.bmp', '4776.bmp', '4960.bmp', '4664.bmp', '4876.bmp', '4740.bmp', '4724.bmp', '4700.bmp', '4756.bmp', '4624.bmp', '4592.bmp', '4924.bmp', '4796.bmp', '4840.bmp', '4888.bmp', '4892.bmp', '4616.bmp', '4916.bmp', '4868.bmp', '4668.bmp', '4628.bmp', '4964.bmp', '4548.bmp', '4896.bmp', '4688.bmp', '4764.bmp', '4844.bmp', '4580.bmp', '4860.bmp']
idx: 0
##########################################################
Current idx: 0 / 113
Current img: 4656.bmp
Detection took 0.422s for 2 object proposals
idx: 1
##########################################################
Current idx: 1 / 113
Current img: 4932.bmp
Detection took 0.220s for 1 object proposals
idx: 2
##########################################################
Current idx: 2 / 113
Current img: 4752.bmp
Detection took 0.388s for 6 object proposals

The problem is that no matter which images I use for testing, the code always gets stuck after the third image and never goes on to test the fourth.

nqanh commented 6 years ago

Does it show any error? And do you get results for the 1st and 2nd images? If it works for one image, the rest is just the same; you only need to feed all the images to the net via a for loop.

lsj910128 commented 6 years ago

It doesn't show any error, and the results from the 1st, 2nd and 3rd images look good.

Below is the code:

print 'image folder: ', img_folder

list_test_img = os.walk(img_folder).next()[2]
print 'list_test_img: ', list_test_img

# run detection for each image
for idx, im_name in enumerate(list_test_img):
    print 'idx:', idx
    print '##########################################################'
    #im_name = im_name.strip()
    #im_name = '5'
    print 'Current idx: ', idx, ' / ', len(list_test_img)
    print 'Current img: ', im_name
    run_affordance_net(net, im_name)

The code then gets stuck on the fourth image and cannot continue testing it. I waited for half an hour and the code still didn't go on. (The image index starts from 0.)


nqanh commented 6 years ago

Great to hear that it works! For the other images, you may want to check whether there are any interruptions in the code (e.g. cv2.waitKey(0)). It's also better to print debug messages to track down this kind of problem. I'm not sure what you changed in the code, so I can't help more.

lsj910128 commented 6 years ago

I commented out cv2.waitKey(0), and now the code runs!

Thank you very much!

lsj910128 commented 6 years ago

Hi, I tested all of the images and the results are good.

When I run python tools/demo_img.py, I'd like to generate a log file. How can I do that?

lsj910128 commented 6 years ago

I used python tools/demo_img.py 2>&1 | tee xxx.log to generate the log file.