SamsungLabs / tr3d

[ICIP2023] TR3D: Towards Real-Time Indoor 3D Object Detection

Questions about Inference results #27

Open Poodlee opened 8 months ago

Poodlee commented 8 months ago

Hello. I want to run 3D object detection on a cube whose width, length, and height are all 0.04 m, captured with an RGB-D camera, so I am using this package. This is my first time doing this kind of work, and my English is still not great, so please bear with me.

First I describe how I created the dataset and how I proceeded, and then I ask my questions at the end.

1. Understanding how SUN RGB-D data is processed and creating the dataset

Label: The tools/data_converter/sunrgbd_data_utils.py file contains the following code.

class SUNRGBDInstance(object):

    def __init__(self, line):
        data = line.split(' ')
        data[1:] = [float(x) for x in data[1:]]
        self.classname = data[0]
        self.xmin = data[1]
        self.ymin = data[2]
        self.xmax = data[1] + data[3]
        self.ymax = data[2] + data[4]
        self.box2d = np.array([self.xmin, self.ymin, self.xmax, self.ymax])
        self.centroid = np.array([data[5], data[6], data[7]])
        self.width = data[8]
        self.length = data[9]
        self.height = data[10]
        # data[9] is x_size (length), data[8] is y_size (width), data[10] is
        # z_size (height) in our depth coordinate system,
        # l corresponds to the size along the x axis
        self.size = np.array([data[9], data[8], data[10]]) * 2
        self.orientation = np.zeros((3, ))
        self.orientation[0] = data[11]
        self.orientation[1] = data[12]
        self.heading_angle = np.arctan2(self.orientation[1],
                                        self.orientation[0])
        self.box3d = np.concatenate(
            [self.centroid, self.size, self.heading_angle[None]])

As the code above shows, each label consists of 13 items in total, so I created the label .txt files accordingly. Since I planned to use only the point cloud, I entered the arbitrary values 1 1 2 2 for the 2D bbox. An example label line:

    box 1 1 2 2 -0.024179 0.896166 0.111629 0.04 0.04 0.04 1.000000 0.000000
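To see what the parser makes of such a line, the sketch below re-implements only the arithmetic of the quoted `SUNRGBDInstance.__init__`. The function name `parse_label_line` is hypothetical and not part of the repository. Note that the quoted code multiplies the three size fields by 2, so a label value of 0.04 yields a box size of 0.08; this suggests the fields are read as half-extents, which may be worth checking against the intended 0.04 m cube.

```python
import numpy as np

def parse_label_line(line):
    """Hypothetical helper mirroring the arithmetic of SUNRGBDInstance.__init__."""
    data = line.split(' ')
    classname = data[0]
    values = [float(x) for x in data[1:]]   # values[i] corresponds to data[i + 1]
    centroid = np.array(values[4:7])        # data[5], data[6], data[7]
    # data[9], data[8], data[10], doubled exactly as in the quoted code
    size = np.array([values[8], values[7], values[9]]) * 2
    # arctan2(orientation[1], orientation[0]) = arctan2(data[12], data[11])
    heading = np.arctan2(values[11], values[10])
    box3d = np.concatenate([centroid, size, [heading]])
    return classname, box3d

label = 'box 1 1 2 2 -0.024179 0.896166 0.111629 0.04 0.04 0.04 1.000000 0.000000'
classname, box3d = parse_label_line(label)
print(classname, box3d)
```

With orientation fields `1.000000 0.000000`, the heading angle evaluates to 0, which corresponds to the box being aligned with the x axis.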

Depth: In the depth file, each point is stored as x, y, z, r, g, b, in that order. The r, g, and b values are scaled to the range 0 to 1.
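A minimal sketch of writing and reading back such a file. It assumes (this is not stated in the thread) that the .bin point files are flat float32 arrays with six values per point and that the source colors are 8-bit. The helper name `save_points_bin` and the output file name are hypothetical.

```python
import os
import tempfile

import numpy as np

def save_points_bin(xyz, rgb_uint8, path):
    """Hypothetical helper: store N points as float32 rows of x, y, z, r, g, b."""
    xyz = np.asarray(xyz, dtype=np.float32)
    # Assumption: colors arrive as 0..255 and are scaled to 0..1.
    rgb = np.asarray(rgb_uint8, dtype=np.float32) / 255.0
    points = np.hstack([xyz, rgb]).astype(np.float32)
    points.tofile(path)
    return points

# Two dummy points, only to show the layout.
out_path = os.path.join(tempfile.gettempdir(), 'example_points.bin')
pts = save_points_bin([[0.0, 0.9, 0.11], [0.02, 0.9, 0.11]],
                      [[255, 0, 0], [0, 128, 255]],
                      out_path)

# Read the file back and restore the N x 6 shape.
loaded = np.fromfile(out_path, dtype=np.float32).reshape(-1, 6)
print(loaded)
```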

2. Train

I ran training with the command below.

python tools/train.py configs/tr3d/tr3d_sunrgbd-3d-10class.py

3. Inference

I ran inference with the command below.

python demo/pcd_demo.py data/sunrgbd/points/000960.bin configs/tr3d/tr3d_sunrgbd-3d-10class.py work_dirs/surgbd-data/latest.pth --score-thr 0.6 --show

4. Problems in inference

Difference between the label z value and the inferred z value

[images: +0 105, show_result]

In the label file and in reality, the z value is 0.11, but the inferred value is 0.0759.

[image: infer_bottom]

Also, when I run inference with the cube on the floor, I even get a negative z value.

So I visualized the point cloud used in the dataset to check it. As the image below shows, the cube's center is at approximately z = 0.11.

[image: pc vis]

My question: the x and y coordinates are fairly accurate, so why is there an error in the z value?

Yaw value

[image: image (9)]

After training, I ran inference on the same data and got yaw values I cannot make sense of. The results are below. The mid and max prefixes on the left refer to the distance, and the values on the right are the yaw in radians (degrees).

    max_clock_15     0.08926178 (5.11˚)
    max_clock_30    -1.3506540 (-77.39˚)
    max_clock_45     1.5099705 (86.51˚)
    max_clock_60     1.34937334 (77.31˚)
    max_clock_75     1.3324995 (76.35˚)
    max_straight     0.04222381 (2.42˚)
    mid_clock_15     1.5214607 (87.17˚)
    mid_clock_30     1.4977183 (85.81˚)
    mid_clock_45     1.5112293 (86.59˚)
    mid_clock_60     1.4744493
    mid_clock_75     1.5131842
    mid_straight     1.4999024

There seems to be a problem with my labeling. Could you explain in more detail how the angle values should be specified in the labels?

If anything is missing, please let me know. Thank you.

filaPro commented 8 months ago

Does this comment help you? For the yaw angle, it says that yaw = 0 when the box is oriented along the x axis. Also, the box center is not its geometric center but the center of its bottom face.
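The bottom-face convention can be sketched as a simple shift along the up axis. This assumes z is the up axis, as in the thread. The helper names are hypothetical and the numbers are illustrative only.

```python
import numpy as np

def gravity_to_bottom_center(center_xyz, height):
    """Shift a box center from the geometric center to the bottom-face center."""
    center = np.asarray(center_xyz, dtype=float).copy()
    center[2] -= height / 2.0
    return center

def bottom_to_gravity_center(bottom_xyz, height):
    """Inverse shift: from the bottom-face center back to the geometric center."""
    bottom = np.asarray(bottom_xyz, dtype=float).copy()
    bottom[2] += height / 2.0
    return bottom

# A box of height 0.04 whose geometric center sits at z = 0.11.
geometric = np.array([-0.024179, 0.896166, 0.11])
bottom = gravity_to_bottom_center(geometric, 0.04)
print(bottom)  # z is lowered by half the height
```

When comparing a prediction against a hand-measured geometric center, converting one of the two with a helper like this keeps both in the same convention.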

Poodlee commented 8 months ago

Thank you for your kind and quick response.

  1. z-value: Since the inferred z value consistently differed from the actual z by about 2 to 2.5 cm, I was wondering whether I should readjust the origin, but your explanation about the bottom face resolved that. Thank you!

  2. yaw: I had seen the comment you mentioned before, but I am still unsure about the results. Following that comment, I took the direction to the right as the +x axis, rotated the cube clockwise in 15-degree steps, changed the label values accordingly, and ran training. After training, the results are as follows (the same as reported above).

    mid,max: distance       result(radian,degree)        label
    max_clock_15             0.08926178 (5.11˚)           15˚
    max_clock_30            -1.3506540 (-77.39˚)          30˚
    max_clock_45             1.5099705 (86.51˚)           45˚
    max_clock_60             1.34937334 (77.31˚)          60˚
    max_clock_75             1.3324995 (76.35˚)           75˚
    max_straight             0.04222381 (2.42˚)           0˚
    mid_clock_15             1.5214607 (87.17˚)           15˚
    mid_clock_30             1.4977183 (85.81˚)           30˚
    mid_clock_45             1.5112293 (86.59˚)           45˚
    mid_clock_60             1.4744493 (84.48˚)           60˚
    mid_clock_75             1.5131842 (86.70˚)           75˚
    mid_straight             1.4999024 (85.89˚)           0˚

    (The cube shown below was rotated clockwise around the z-axis for training.) [image: IMG_5761]

Since the object is a cube, could this be because a 60-degree clockwise rotation and a 30-degree counter-clockwise rotation are indistinguishable? Or is there a problem with my labeling?
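The symmetry in question can be checked numerically. The sketch below takes the cube's square footprint, rotates it by -60 degrees and by +30 degrees, and compares the resulting corner sets. It assumes counter-clockwise angles are positive, but the symmetry holds under either sign convention. The function names are hypothetical.

```python
import numpy as np

def footprint_corners(cx, cy, side, yaw):
    """Corners of a square footprint rotated by yaw (radians) about its center."""
    h = side / 2.0
    local = np.array([[h, h], [-h, h], [-h, -h], [h, -h]])
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    return local @ rot.T + np.array([cx, cy])

def same_corner_set(a, b, tol=1e-9):
    """True if every corner in a coincides with some corner in b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return bool(np.all(d.min(axis=1) < tol))

cw_60 = footprint_corners(0.0, 0.0, 0.04, np.deg2rad(-60))
ccw_30 = footprint_corners(0.0, 0.0, 0.04, np.deg2rad(30))
ccw_45 = footprint_corners(0.0, 0.0, 0.04, np.deg2rad(45))

print(same_corner_set(cw_60, ccw_30))   # the two rotations differ by 90 degrees
print(same_corner_set(ccw_45, ccw_30))  # these differ by 15 degrees
```

The two rotations differ by exactly 90 degrees, so for a square footprint they occupy the same region of space, and geometry alone cannot tell them apart.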

filaPro commented 8 months ago

The angle prediction may not be very accurate. Also, we do not really distinguish between rotations that differ by 90 degrees. You can find more details on the angle question in the FCAF3D paper.
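If rotations that differ by 90 degrees are treated as equivalent, one way to compare a predicted yaw with a label is to measure the difference modulo a quarter turn. This is only an illustrative sketch, not the evaluation used by TR3D or FCAF3D. The function name is hypothetical.

```python
import numpy as np

def yaw_error_mod_quarter_turn(pred, label):
    """Smallest absolute yaw difference (radians) when angles that are
    90 degrees apart are treated as the same orientation."""
    period = np.pi / 2.0
    diff = (pred - label) % period
    return min(diff, period - diff)

# -60 degrees vs +30 degrees: 90 degrees apart, so the error is about 0.
err_a = yaw_error_mod_quarter_turn(np.deg2rad(-60), np.deg2rad(30))
# 87 degrees vs 0 degrees: 3 degrees short of a quarter turn.
err_b = yaw_error_mod_quarter_turn(np.deg2rad(87), np.deg2rad(0))
print(np.rad2deg(err_a), np.rad2deg(err_b))
```

Under this measure, a prediction near 90 degrees for a label of 0 degrees counts as a small error, which is one way to read results where box-like objects come out rotated by about a quarter turn.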