Negative Loss for Pointpillar_MIMO_Var_C

Shangyi-Li commented 1 year ago

I tried to train a pointpillar_mimo_var_c model with the following changes:

Ran without the Docker container, because my server uses CUDA 11.3 and PyTorch 12.1
Did not use road_plane
Used the KITTI dataset
Random seed 3

The loss rapidly became negative.

The training parameters are the defaults, as follows:

       LOSS_CONFIG:
            CLF_LOSS_TYPE: SoftmaxFocalLossV2
            REG_LOSS_TYPE: VarRegLoss
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
                'loc_l1_weight': 1.0,
                'loc_var_weight': 0.05
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False

        EVAL_METRIC: kitti

        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10

I believe this is exactly the same as the demo code and config files provided. Could you please help me? I would like to know which parameters are appropriate and which ones you used for the experiments in the related IEEE paper. I would appreciate it if you could take some time to look into this problem.

mpitropov commented 1 year ago

The variance losses are negative which made the total loss negative for me. Do you get similar results to this?

Shangyi-Li commented 1 year ago

> The variance losses are negative which made the total loss negative for me. Do you get similar results to this?

So it's quite normal for the total loss to be negative? A negative variance loss is really not what I expected, so I have never trained MIMO_var_C for more than 15 epochs. I will check the loss curves after training and keep you informed, if that doesn't bother you. I am not very familiar with this area, and I am still curious why a negative loss works in neural network optimization.

Thanks for replying so quickly. Your MIMO is amazing work on uncertainty evaluation for 3D detection, and it helps me a lot. I really appreciate that you published your code.

mpitropov commented 1 year ago

Yes, it is normal. If you plug the equations into Wolfram Alpha and enter some reasonable values, you can see that the result is most likely negative. You could also print some values where I implemented the loss functions to check the implementation.
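As a rough illustration of plugging in values, here is a minimal sketch. It assumes the variance regression term has the common Gaussian negative log-likelihood form 0.5 * exp(-s) * r^2 + 0.5 * s, where r is the regression residual and s is the predicted log-variance. This form and the function names below are assumptions for illustration and are not taken from the repository, so the actual VarRegLoss may differ.

```python
import math


def gaussian_nll_term(residual: float, log_var: float) -> float:
    """Assumed per-element loss: 0.5 * exp(-s) * r^2 + 0.5 * s, with s = log(sigma^2).

    Hypothetical form for illustration, not copied from the repository.
    """
    return 0.5 * math.exp(-log_var) * residual ** 2 + 0.5 * log_var


def min_over_log_var(residual: float) -> float:
    """Smallest value of the term for a fixed residual, reached at s = log(r^2)."""
    return 0.5 + 0.5 * math.log(residual ** 2)


# (residual, predicted log-variance) pairs chosen by hand for illustration.
examples = [
    (0.05, math.log(0.01)),  # small residual, confident prediction
    (0.5, math.log(0.25)),   # medium residual, matching variance
    (2.0, math.log(4.0)),    # large residual, matching variance
]

for residual, log_var in examples:
    value = gaussian_nll_term(residual, log_var)
    print(f"residual={residual:4.2f} log_var={log_var:6.3f} loss={value:7.4f}")
```

Under this assumed form, the best achievable value for a given residual is 0.5 + 0.5 * log(r^2), which is below zero whenever |r| is smaller than about 0.61. A model that fits the box parameters well would therefore be expected to drive this term negative.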

I'm not too sure how this affects the optimization. I remember that many of the papers that used these loss functions for 3D object detection were closed-source. Maybe they found a different way to implement them and get a positive loss?
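One general point may help with the optimization question: gradient-based optimizers only use the gradient of the loss, so adding a constant offset that makes the loss positive does not change the updates. The sketch below checks this numerically on the same assumed Gaussian negative log-likelihood term, 0.5 * exp(-s) * r^2 + 0.5 * s. The loss form, the offset value, and all function names are hypothetical and are not taken from the repository.

```python
import math


def loss(log_var: float, residual: float = 0.5) -> float:
    """Assumed variance regression term (hypothetical form, not from the repo)."""
    return 0.5 * math.exp(-log_var) * residual ** 2 + 0.5 * log_var


def shifted_loss(log_var: float, residual: float = 0.5, offset: float = 10.0) -> float:
    """Same term plus a constant, so the reported value stays positive."""
    return loss(log_var, residual) + offset


def numerical_grad(f, x: float, eps: float = 1e-6) -> float:
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def descend(f, x0: float, lr: float = 0.5, steps: int = 200) -> float:
    """Plain gradient descent on a scalar parameter."""
    x = x0
    for _ in range(steps):
        x -= lr * numerical_grad(f, x)
    return x


s_plain = descend(loss, 0.0)
s_shifted = descend(shifted_loss, 0.0)

print(f"optimum with raw loss:     s={s_plain:.4f}, loss={loss(s_plain):.4f}")
print(f"optimum with shifted loss: s={s_shifted:.4f}, loss={shifted_loss(s_shifted):.4f}")
```

Both runs should converge to the same log-variance, close to log(r^2), even though one reported loss is negative and the other is positive. In this toy setting the sign of the loss value carries no meaning for the optimizer; only the shape of the loss surface matters.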

mpitropov / LiDAR-MIMO