kentaroy47 / frcnn-from-scratch-with-keras

:collision:Faster R-CNN from scratch written with Keras
Apache License 2.0

Anchor box generation when calculating the rpn #104

Open valentin-fngr opened 3 years ago

valentin-fngr commented 3 years ago

Hello :]

First of all, thank you very much for this awesome repository! I would like to ask a question about the calc_rpn function in data_generators.py. This part confuses me the most:

for anchor_size_idx in range(len(anchor_sizes)):
    for anchor_ratio_idx in range(n_anchratios):
        anchor_x = anchor_sizes[anchor_size_idx] * anchor_ratios[anchor_ratio_idx][0]
        anchor_y = anchor_sizes[anchor_size_idx] * anchor_ratios[anchor_ratio_idx][1]

        for ix in range(output_width):
            # x-coordinates of the current anchor box
            x1_anc = downscale * (ix + 0.5) - anchor_x / 2
            x2_anc = downscale * (ix + 0.5) + anchor_x / 2

            # ignore boxes that go across image boundaries
            if x1_anc < 0 or x2_anc > resized_width:
                continue

            for jy in range(output_height):

                # y-coordinates of the current anchor box
                y1_anc = downscale * (jy + 0.5) - anchor_y / 2
                y2_anc = downscale * (jy + 0.5) + anchor_y / 2

I have a hard time understanding the following line: downscale * (ix + 0.5) - anchor_x / 2, which is repeated in a similar form for y1_anc and y2_anc.

  1. What are x1_anc, x2_anc, y1_anc, y2_anc? Do they represent the top-left and bottom-right corners of an anchor box?
  2. What does each term in the formula downscale * (ix + 0.5) - anchor_x / 2 mean? I understand that downscale is the stride (input_width / resized_width), but I don't understand how everything fits together.

Thank you so much for your time!

Valentin

VinayChauhan1996 commented 2 years ago

@valentin-fngr I also don't understand the code above. Can anyone help us with it?
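
One possible reading of the formula in the question, offered as a sketch rather than as the author's own explanation. It assumes that downscale is the RPN stride, meaning the number of resized-image pixels covered by one feature-map cell. Under that assumption, downscale * (ix + 0.5) is the x-coordinate, in resized-image pixels, of the centre of feature-map column ix, and subtracting or adding anchor_x / 2 moves half the anchor width to the left or right of that centre. (x1_anc, y1_anc) would then be the top-left corner and (x2_anc, y2_anc) the bottom-right corner of the anchor box. The helper name anchor_corners and all numeric values below are hypothetical and chosen only for illustration.

```python
def anchor_corners(ix, jy, downscale, anchor_w, anchor_h):
    """Return (x1, y1, x2, y2) of the anchor centred on feature-map cell (ix, jy).

    Hypothetical helper: it mirrors the arithmetic quoted in the question,
    it is not a function from the repository.
    """
    # Centre of cell (ix, jy), mapped back to resized-image pixel coordinates.
    # The +0.5 picks the middle of the cell instead of its top-left corner.
    cx = downscale * (ix + 0.5)
    cy = downscale * (jy + 0.5)
    # Extend half the anchor's width/height on each side of the centre.
    return (cx - anchor_w / 2, cy - anchor_h / 2,
            cx + anchor_w / 2, cy + anchor_h / 2)


# Illustrative values (assumptions, not read from the repository's config).
downscale = 16           # assumed stride: image pixels per feature-map cell
anchor_size = 128        # one entry of anchor_sizes
anchor_ratio = (1, 2)    # one entry of anchor_ratios: (width factor, height factor)

anchor_w = anchor_size * anchor_ratio[0]   # plays the role of anchor_x
anchor_h = anchor_size * anchor_ratio[1]   # plays the role of anchor_y

box = anchor_corners(10, 10, downscale, anchor_w, anchor_h)
print(box)   # (104.0, 40.0, 232.0, 296.0)

# An anchor centred on the first column sticks out past the left image edge,
# which is the case the "x1_anc < 0" check in the quoted loop skips.
edge = anchor_corners(0, 10, downscale, anchor_w, anchor_h)
print(edge[0])   # -56.0
```

With these numbers, cell (10, 10) has its centre at pixel (168, 168), so a 128 x 256 anchor spans x from 104 to 232 and y from 40 to 296. The nested loops in the quoted snippet repeat this computation for every cell and every size/ratio pair.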