Changing the anchor size in the labels and the model config has no influence on model training or output visualization (anchor generation)

Hello,

I changed the anchor size of the bounding boxes in second.yaml (renamed to Bernhard_second) to 9.6 m x 9.6 m x 4 m for every class, e.g.:

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[9.6, 9.6, 4]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[9.6, 9.6, 4]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Cyclist',
                'anchor_sizes': [[9.6, 9.6, 4]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]

I also changed the bounding box size in the training labels for every class. The new labels look like this, e.g.:

Car 0.00 0 -1.50 150.55505 -276.74 1110.555 683.26 400 960 960 0.70 1.76 23.88 -1.48
Car 0.00 2 1.75 132.39502 -294.8 1092.395 665.2 400 960 960 0.24 1.84 66.37 1.76
Car 0.00 0 1.78 106.714966 -293.675 1066.715 666.325 400 960 960 -2.19 1.96 68.25 1.75
DontCare -1 -1 -10 243.64001 -304.96 1203.64 655.04004 400 960 960 -1000 -1000 -1000 -10
DontCare -1 -1 -10 290.52002 -312.25 1250.52 647.75 400 960 960 -1000 -1000 -1000 -10

Then I retrained SECOND on them.

But when I run the visualization, the bounding box sizes have clearly not changed at all. They should be square, but they are not.

Apparently neither the config file nor the training labels influence the bounding box size. How does one influence it? Or have I done something fundamentally wrong?

This is how I ran the demo:

        python demo.py --cfg_file cfgs/kitti_models/Bernhard_second.yaml --ckpt ~/data/OpenPCDet/output/kitti_models/Bernhard_second/default/ckpt/checkpoint_epoch_80.pth --data_path ~/data/OpenPCDet/data/kitti/training/velodyne/000100.bin

Bernhard_second is my custom config file!

Thanks in advance!

Anchor boxes are, as the name suggests, just anchors. The network always regresses from the anchors to the final bounding boxes, so the network you trained may simply not be very sensitive to the anchor size and can still regress the bounding boxes correctly even though your anchors are much bigger.
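
To make this concrete, here is a minimal sketch of how an anchor plus predicted residuals becomes a final box. It assumes a VoxelNet/SECOND-style residual encoding (center offsets normalized by the anchor's ground-plane diagonal, log-ratio sizes). The names `encode_box` and `decode_box` are hypothetical, and this is not OpenPCDet's actual code.

```python
import math

# Boxes are (x, y, z, dx, dy, dz, heading). The encoding below is an
# assumption for illustration, not copied from OpenPCDet.

def encode_box(gt, anchor):
    """Residual targets for a ground-truth box relative to an anchor."""
    xg, yg, zg, dxg, dyg, dzg, rg = gt
    xa, ya, za, dxa, dya, dza, ra = anchor
    diag = math.hypot(dxa, dya)  # anchor's ground-plane diagonal
    return (
        (xg - xa) / diag, (yg - ya) / diag, (zg - za) / dza,
        math.log(dxg / dxa), math.log(dyg / dya), math.log(dzg / dza),
        rg - ra,
    )

def decode_box(residual, anchor):
    """Invert encode_box: anchor + residual -> box."""
    xt, yt, zt, dxt, dyt, dzt, rt = residual
    xa, ya, za, dxa, dya, dza, ra = anchor
    diag = math.hypot(dxa, dya)
    return (
        xt * diag + xa, yt * diag + ya, zt * dza + za,
        math.exp(dxt) * dxa, math.exp(dyt) * dya, math.exp(dzt) * dza,
        rt + ra,
    )

gt_car = (20.0, 3.0, -1.0, 3.9, 1.6, 1.56, 0.3)
small_anchor = (19.6, 3.2, -1.0, 3.9, 1.6, 1.56, 0.0)
big_anchor = (19.6, 3.2, -1.0, 9.6, 9.6, 4.0, 0.0)

for anchor in (small_anchor, big_anchor):
    residual = encode_box(gt_car, anchor)
    print([round(v, 3) for v in decode_box(residual, anchor)])
```

Both anchors decode back to the same ground-truth box. So as long as the labels the network trains on keep their original sizes, the predicted boxes can keep them too, whatever the anchor size.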


What exactly did you change in the annotations? How did they look before? It is not clear to me what you changed.

@MartinHahner Thanks for the quick answer,

The original labels looked for example like this:

Truck 0.00 0 -1.57 599.41 156.40 629.75 189.25 2.85 2.63 12.34 0.47 1.49 69.44 -1.56
Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57
Cyclist 0.00 3 -1.65 676.60 163.95 688.98 193.93 1.86 0.60 2.02 4.59 1.32 45.84 -1.55
DontCare -1 -1 -10 503.89 169.71 590.61 190.13 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 511.35 174.96 527.81 187.45 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 532.37 176.35 542.68 185.27 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 559.62 175.83 575.40 183.15 -1 -1 -1 -1000 -1000 -1000 -10

and were changed to this:

Truck 0.00 0 -1.57 134.57996 -307.175 1094.58 652.825 400 960 960 0.47 1.49 69.44 -1.56
Car 0.00 0 1.85 -74.28 -287.67 885.72 672.32996 400 960 960 -16.53 2.39 58.49 1.57
Cyclist 0.00 3 -1.65 202.78998 -301.06 1162.79 658.94 400 960 960 4.59 1.32 45.84 -1.55
DontCare -1 -1 -10 67.25 -300.08 1027.25 659.92004 400 960 960 -1000 -1000 -1000 -10
DontCare -1 -1 -10 39.580017 -298.79498 999.58 661.205 400 960 960 -1000 -1000 -1000 -10
DontCare -1 -1 -10 57.525024 -299.19 1017.525 660.81 400 960 960 -1000 -1000 -1000 -10
DontCare -1 -1 -10 87.51001 -300.51 1047.51 659.49 400 960 960 -1000 -1000 -1000 -10

using this script I wrote:

import os
import argparse
import numpy as np
#code reused from this script:
from pcdet.utils.object3d_kitti import Object3d #to parse lines more efficiently

#weird way of getting a default path for the labels folder
w_dir=os.getcwd() #save current working_directory
os.chdir(os.path.dirname(os.getcwd()))
os.chdir("data/kitti/training")
#reset w_dir

#parser
parser = argparse.ArgumentParser(description='arg parser')
parser.add_argument('--labels_dir', type=str, default=os.getcwd()+'/label_2',
                help='specify the directory of the labels')
args = parser.parse_args()
os.chdir(w_dir)# reset working directory

#convert the box labels to the shape 9.6 x 9.6 x 4 meters since those are our RPN targets for now!
print("creating new 9.6 x 9.6 x 4 meters bounding boxes...")
for i,filename in enumerate(os.listdir(args.labels_dir)):
    print(filename)
    #open source file:
    with open(os.path.join(args.labels_dir, filename), 'r') as s: # open in readonly mode

        #open target file:
        t=open(os.path.join(os.path.dirname(args.labels_dir), "extracted/"+filename),"w+")

        for line in s.readlines():
            label = line.strip().split(' ')
            data=[]

            data.append(label[0]) #label
            data.append(label[1])#truncation
            data.append(label[2])#occlusion
            data.append(label[3])#alpha

            #convert bbox2d to newer dimensions:
            left_top=np.array((float(label[4]), float(label[5])),dtype=np.float32)
            right_bottom=np.array((float(label[6]), float(label[7])),dtype=np.float32)
            center=((left_top+right_bottom)/2)
            left_top=center-np.array((480,480),dtype=np.float32)
            right_bottom=center+np.array((480,480),dtype=np.float32)
            data.append(str(left_top[0]))#bbox2d left(left_top)
            data.append(str(left_top[1]))#bbox2d top(left_top)
            data.append(str(right_bottom[0]))#bbox2d right(right_bottom)
            data.append(str(right_bottom[1]))#bbox2d bottom(right_bottom)

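            #convert dimensions 3d:
            #NOTE: KITTI dimensions are in meters, so these values are 100x too large
            #(the later script writes 4.00 / 9.60 / 9.60 instead)
            data.append("400")#self.h label[8]
            data.append("960")#self.w label[9]
            data.append("960")#self.l  label[10]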

            #location stays identical:
            data.append(label[11])#x
            data.append(label[12])#y
            data.append(label[13])#z

            #rotation stays identical:
            data.append(label[14])#ry

            #copy score if it exists:
            if len(label) == 16:
                data.append(label[15])

            #join data
            data= " ".join(data)
            #writing the data to the file in the extracted folder
            t.write(data + '\n')

        #s is closed automatically by the with statement
        t.close()

Did you rerun python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml as described here after you altered your annotations with the script above?
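
For context on why this step matters: as far as I understand, the labels are parsed once by create_kitti_infos and cached in info files, and training reads those cached annotations rather than label_2 directly. The sketch below illustrates only the caching idea in plain Python. The `build_infos` helper and its dictionary layout are hypothetical stand-ins, not OpenPCDet's real file format.

```python
import tempfile
from pathlib import Path

def parse_dimensions(label_path):
    """Return (h, w, l) for every object in a KITTI-style label file."""
    dims = []
    for line in Path(label_path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 15:
            dims.append(tuple(float(v) for v in fields[8:11]))
    return dims

def build_infos(label_dir):
    """Hypothetical stand-in for the info-generation step: parse once, cache."""
    return {p.stem: parse_dimensions(p) for p in sorted(Path(label_dir).glob("*.txt"))}

with tempfile.TemporaryDirectory() as label_dir:
    label = Path(label_dir) / "000100.txt"
    label.write_text(
        "Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57\n"
    )
    infos = build_infos(label_dir)  # cached snapshot of the labels

    # Edit the label on disk, as a conversion script would.
    label.write_text(
        "Car 0.00 0 1.85 387.63 181.54 423.81 203.12 4.00 9.60 9.60 -16.53 2.39 58.49 1.57\n"
    )

    stale = infos["000100"]                   # cache still holds the old size
    fresh = build_infos(label_dir)["000100"]  # regenerated cache sees the edit

print("cached:     ", stale)
print("regenerated:", fresh)
```

Until the cache is rebuilt, anything that reads it keeps seeing the old box sizes, which would explain why edited labels appear to have no effect.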

Thanks! ;) That was definitely something I had missed. Once I do it, the network seems to use the adjusted labels. However, it cannot train on them yet, since my label-changing script seems to have a problem. But that should be an easy fix!
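
The labels shown earlier in the thread carry dimensions of 400 960 960, while the corrected script further down writes 4.00 9.60 9.60, so the problem looks like a units mix-up (KITTI stores height, width and length in meters). A small sanity check along these lines could catch that before training. The `MAX_PLAUSIBLE_M` threshold and the `implausible_dimensions` helper are assumptions made up for illustration.

```python
MAX_PLAUSIBLE_M = 30.0  # assumed upper bound for any object edge, in meters

def implausible_dimensions(label_lines, max_m=MAX_PLAUSIBLE_M):
    """Return (line_no, class_name, (h, w, l)) for boxes with suspicious sizes."""
    flagged = []
    for line_no, line in enumerate(label_lines, start=1):
        fields = line.split()
        # DontCare entries use -1 placeholders for dimensions, so skip them.
        if len(fields) < 15 or fields[0] == "DontCare":
            continue
        h, w, l = (float(v) for v in fields[8:11])
        if not all(0.0 < v <= max_m for v in (h, w, l)):
            flagged.append((line_no, fields[0], (h, w, l)))
    return flagged

labels = [
    "Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57",
    "Car 0.00 0 1.85 -74.28 -287.67 885.72 672.32996 400 960 960 -16.53 2.39 58.49 1.57",
    "DontCare -1 -1 -10 503.89 169.71 590.61 190.13 -1 -1 -1 -1000 -1000 -1000 -10",
]
for entry in implausible_dimensions(labels):
    print(entry)
```

Only the second line is flagged. The original label and the DontCare entry pass.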

@MartinHahner Rerunning create_kitti_infos as you suggested and fixing some bugs in my label transformation script solved the problem for me. Here is the new script, in case someone wants to use it:

import os
import argparse
import numpy as np
#code reused from this script:
from pcdet.utils.object3d_kitti import Object3d #to parse lines more efficiently

#weird way of getting a default path for the labels folder
w_dir=os.getcwd() #save current working_directory
os.chdir(os.path.dirname(os.getcwd()))
os.chdir("data/kitti/training")
#reset w_dir

#parser
parser = argparse.ArgumentParser(description='arg parser')
parser.add_argument('--labels_dir', type=str, default=os.getcwd()+'/label_2',
                help='specify the directory of the labels')
args = parser.parse_args()
os.chdir(w_dir)# reset working directory

#convert the box labels to the shape 9.6 x 9.6 x 4 meters since those are our RPN targets for now!
print("creating new 9.6 x 9.6 x 4 meters bounding boxes...")
for i,filename in enumerate(os.listdir(args.labels_dir)):
    print(filename)
    #open source file:
    with open(os.path.join(args.labels_dir, filename), 'r') as s: # open in readonly mode

        #open target file:
        t=open(os.path.join(os.path.dirname(args.labels_dir), "extracted/"+filename),"w+")

        for line in s.readlines():
            label = line.strip().split(' ')
            data=[]

            data.append(label[0]) #label
            data.append(label[1])#truncation
            data.append(label[2])#occlusion
            data.append(label[3])#alpha

            #convert bbox2d to newer dimensions:
            left_top=np.array((float(label[4]), float(label[5])),dtype=np.float32)
            right_bottom=np.array((float(label[6]), float(label[7])),dtype=np.float32)
            center=((left_top+right_bottom)/2)
            left_top=center-np.array((480,480),dtype=np.float32)
            right_bottom=center+np.array((480,480),dtype=np.float32)

            #check that we do not violate the point cloud range:
            #[0, -40, -3, left bottom
            #70.4, 40, 1] right top
            #NOTE: these limits are in meters, while the 2D box below is in image pixels
            data.append(str(max([left_top[0], -40])))#bbox2d left(left_top)
            data.append(str(min([left_top[1], 70.4])))#bbox2d top(left_top)
            data.append(str(max([right_bottom[0], 40])))#bbox2d right(right_bottom)
            data.append(str(min([right_bottom[1], 0])))#bbox2d bottom(right_bottom)

            #convert dimensions 3d (in meters):
            data.append("4.00")#self.h label[8]
            data.append("9.60")#self.w label[9]
            data.append("9.60")#self.l  label[10]

            #location stays identical:
            data.append(label[11])#x
            data.append(label[12])#y
            data.append(label[13])#z

            #rotation stays identical:
            data.append(label[14])#ry

            #copy score if it exists:
            if len(label) == 16:
                data.append(label[15])

            #join data
            data= " ".join(data)
            #writing the data to the file in the extracted folder
            t.write(data + '\n')

        #s is closed automatically by the with statement
        t.close()

Not sure if lines such as max([left_top[0], -40]) are really necessary, but I added them anyway.
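
On whether the clamping is needed: fields 4 to 7 of a KITTI label are the 2D box in image pixels, whereas [0, -40, -3, 70.4, 40, 1] is the point cloud range in meters, so the two are not directly comparable. If the 2D box should be clamped at all, clamping it to the image bounds seems more natural. A minimal sketch, assuming an image size of 1242 x 375 pixels (typical for KITTI, but it varies between frames, so treat it as a placeholder) and a hypothetical helper `clamp_box_to_image`:

```python
def clamp_box_to_image(left, top, right, bottom, width=1242, height=375):
    """Clip a 2D box given in pixel coordinates to the image rectangle.

    width and height are assumed defaults; pass the real image size per frame.
    """
    left = min(max(left, 0.0), width - 1.0)
    right = min(max(right, 0.0), width - 1.0)
    top = min(max(top, 0.0), height - 1.0)
    bottom = min(max(bottom, 0.0), height - 1.0)
    return left, top, right, bottom

# 2D box taken from one of the enlarged labels earlier in the thread.
print(clamp_box_to_image(-74.28, -287.67, 885.72, 672.33))
```

Unlike the lines in the script, this keeps left <= right and top <= bottom and stays in pixel units throughout.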

Here are my beautiful square boxes:

Thank you very much for your quick help and for teaching me more about anchor boxes!

open-mmlab / OpenPCDet, issue #891