lzccccc / SMOKE

SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
MIT License
696 stars, 177 forks

About Evaluation on Val Split #89

Closed by arwagh 9 months ago

arwagh commented 9 months ago

I ran the code with the following configuration:

MODEL:
  WEIGHT: "/content/gdrive/MyDrive/SMOKE/tools/logs/model_final.pth" # # "catalog://ImageNetPretrained/DLA34" #
INPUT:
  FLIP_PROB_TRAIN: 0.5
  SHIFT_SCALE_PROB_TRAIN: 0.3
DATASETS:
  DETECT_CLASSES: ("Car", "Cyclist", "Pedestrian")
  TRAIN: ("kitti_train",)
  TEST: ("kitti_test",)
  TRAIN_SPLIT: "train"
  TEST_SPLIT: "val"
SOLVER:
  BASE_LR: 2.5e-4 # 2.5e-4 for DLA training
  STEPS: (10, 1000, 10000, 18000) # The original is 10000, 18000
  MAX_ITERATION: 25000 # 25000
  IMS_PER_BATCH: 1

using this command:

!python tools/plain_train_net.py --eval-only --config-file "configs/smoke_gn_vector.yaml"

and I got the following results, which are much higher than those reported in the paper:

car_detection AP: 99.253822 90.208725 90.030869
car_orientation AP: 99.235359 90.185669 89.979942
pedestrian_detection AP: 71.481392 71.233459 63.760303
pedestrian_orientation AP: 68.498894 67.344612 60.299564
cyclist_detection AP: 90.288994 90.166794 89.574661
cyclist_orientation AP: 89.407166 89.463783 88.851265
car_detection_ground AP: 83.296700 82.894707 75.743301
pedestrian_detection_ground AP: 43.819630 38.038414 36.839706
cyclist_detection_ground AP: 45.897942 52.835869 46.894695
car_detection_3d AP: 70.499840 68.271675 61.957222
cyclist_detection_3d AP: 35.539608 40.786041 39.733009
pedestrian_detection_3d AP: 40.907085 35.998760 30.820993
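To compare numbers like these across runs, the printed output can be parsed into per-metric triples. This is only a minimal sketch: it assumes every entry follows the `<metric> AP: <v1> <v2> <v3>` pattern seen in the output above, and `parse_ap_lines` is a hypothetical helper, not part of SMOKE.

```python
import re

# Pattern assumed from the pasted output: "<metric> AP: <v1> <v2> <v3>"
AP_PATTERN = re.compile(r"(\w+) AP: ([\d.]+) ([\d.]+) ([\d.]+)")


def parse_ap_lines(text):
    """Return {metric_name: (v1, v2, v3)} for every 'name AP: a b c' match.

    Hypothetical helper for illustration; works on one-per-line or
    space-joined output alike.
    """
    return {
        name: tuple(float(v) for v in values)
        for name, *values in AP_PATTERN.findall(text)
    }


# Three entries copied from the results above.
sample = (
    "car_detection AP: 99.253822 90.208725 90.030869 "
    "car_detection_ground AP: 83.296700 82.894707 75.743301 "
    "car_detection_3d AP: 70.499840 68.271675 61.957222"
)

results = parse_ap_lines(sample)
for name, values in results.items():
    print(name, values)
```

Parsing two runs this way and comparing the resulting dictionaries makes it easy to confirm whether the scores really are identical to the last digit.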

Also, training finishes very quickly, and the scores are exactly the same for every model saved at different iterations, even when I train from scratch. What could be the problem? Can anyone help, please?
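Since `MODEL.WEIGHT` in the configuration above points at a fixed `model_final.pth`, one thing worth ruling out is that every evaluation run loads the same file. A rough sketch for checking whether saved checkpoints are byte-identical follows. The `tools/logs` directory and the `*.pth` pattern are assumptions based on the path in the config, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    groups = {}
    for p in paths:
        groups.setdefault(file_sha256(p), []).append(str(p))
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    log_dir = Path("tools/logs")  # assumed checkpoint directory; adjust as needed
    checkpoints = sorted(log_dir.glob("*.pth"))
    for group in group_identical(checkpoints):
        print("identical files:", group)
```

Identical hashes mean the files are the same. Differing hashes only show that the files differ on disk, so this rules out one possible cause rather than confirming a diagnosis.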

Thank you.

1gjjuser1 commented 1 month ago

I have a question for you. I got results similar to yours, but the metrics in the original paper include not only 3D but also BEV (bird's-eye view). How can we get the BEV metrics?